FOREWORD


DEVELOPMENT  OF  THE  ARMED  FORCES  QUALIFICATION  TEST  AND 
PREDECESSOR  ARMY  SCREENING  TESTS,  1946-1950 
(Based  on  PRS  Report  970) 


The Armed Forces Qualification Test (AFQT) is the means of determining
mental test acceptability of potential enlistees and inductees. It is jointly
developed by the Armed Forces to implement the mental standards established
by law and administrative action for admission into the Armed Forces. The
AFQT also provides a basis for qualitative distribution of manpower among the
Services.

This report is a summary of the work done and the problems encountered
in the development and use of the AFQT. It describes briefly the experience
with other tests used for initial screening and related purposes in the past. It
was out of this experience that the current AFQT was developed.

The test was developed in two comparable forms in accordance with
accepted principles and techniques of test construction and with due regard for
policy and operating problems. When administered according to prescribed
procedures, the AFQT adequately maintains its effectiveness as a mental test
screen. Continual research is underway to maintain and improve mental test
screening procedures.

This report is of interest to research technicians, and to those responsible
for or concerned with initial screening in the Armed Forces.


DEVELOPMENT OF THE ARMED FORCES QUALIFICATION TEST AND
PREDECESSOR ARMY SCREENING TESTS, 1946-1950


INTRODUCTION 

Mental test standards are established by law and military policy as a means of
determining acceptability of potential enlistees and inductees. The purpose of such standards is
to screen out those who cannot profit from military training and who might be actual
liabilities to the Services. Such standards must vary from time to time in accordance with
changes in manpower demands, training facilities, and National Policy.

Psychological  screening  tests  are  used  to  examine  potential  enlistees  and  inductees 
for acceptance in conformity with mental standards. Similar tests were used during World
War II for classification purposes. Experience with classification tests indicated their
value  for  initial  screening.  Constant  improvement  in  the  tests  and  their  use  is  necessary 
because  of  obsolescence  and  the  necessity  to  refine  sensitivity  at  various  cut-off  points  as 
these  are  changed  by  law  or  policy. 

The  program  described  in  this  report  led  to  the  development  of  the  tests  currently 
used  (October  1952)  for  initial  screening.  These  tests,  Armed  Forces  Qualification  Test 
AFQT-1,  and  Armed  Forces  Qualification  Test  AFQT-2,  were  based  on  extensive  research 
with  earlier  forms  of  screening  tests  and  on  follow-up  studies  to  determine  the  applicability 
of  AFQT  to  operating  conditions.  The  studies  relating  to  the  development  and  use  of  AFQT 
represented the joint effort by Army, Navy, Marine Corps, and Air Force research personnel
by direction of the Department of Defense. The Army was designated as the coordinating
and executive agency in this program. The earlier tests had been developed for Army
use only.

The purpose of this report is to summarize the problems encountered in the development
of AFQT and to describe the outcome of attempts to solve these problems. Many of
these problems arose in the development and use of earlier tests. To permit a better
understanding of the development of AFQT, a summary of experience with earlier tests is
provided as a background.


BACKGROUND 


CLASSIFICATION  VS.  SCREENING  TESTS 

Because much of the history of the screening tests developed under this program is
related to previous and concurrent developments of classification tests, and because similar
score conversion systems and terminology frequently apply to both, it is important to clarify
the difference between these two types of tests.

Classification Tests are used primarily to classify men on the basis of abilities for
assignment such as officer training and specialist training. These tests originally measured
just a few abilities (in several cases similar to those currently measured by screening tests),
and in the Army were referred to as "Army General Classification Tests," or AGCT tests.
More recently the classification tests have been expanded to a battery of specific ability
measures called the "Army Classification Battery." Classification tests are administered
to men at reception centers after they have been accepted for service in the Army.


Screening Tests are tests used at recruiting or induction stations for determination of
mental fitness for service in the Army. Cut-off scores are used on screening tests to
determine whether an applicant or selective service registrant will be accepted or rejected
for service insofar as mental qualifications are concerned. Tests labeled AGCT are not
used in screening, although items from the AGCT tests have been used to comprise screening
tests, and in one case two AGCT tests were given different titles and used for screening
purposes.


ARMY GENERAL CLASSIFICATION TESTS (AGCT SERIES)

The Army General Classification Tests (AGCT series) were introduced in 1940.
During the period 1940-1940 they were gradually replaced by the Army Classification Battery.
The development of both of these tests set precedents with respect to standardization
methods and item types which affected the construction, standardization, and application of
screening tests developed under this program. For that reason, it is well to review briefly
pertinent aspects of the history of AGCT tests. These are reviewed below in chronological
order of their appearance.

AGCT-1a: This test consisted of vocabulary, arithmetic reasoning, and block counting
items in equal numbers to make a total of 150 test items. Items were of the multiple-choice
type with four alternatives, and the test was scored rights minus one-third wrongs to yield
a single raw score. This scoring formula was used because at the time it was believed that
a correction for chance was justified. The test was standardized on a population of CCC
enrollees and soldiers, all white, and between the ages of 20 through 29 (N = 2675). These
men were divided into stratified cells based on combinations of age, highest school grade
reached, and geographical area. Then, in order to establish norms representative of the
total US male population between the ages 20 through 29, AGCT scores for these men in the
various stratified cells were weighted to provide representation proportional to that of such
age-education-geographical groups in the 1930 census. The distribution of scores so obtained
was adjusted for convenience to a standard score scale of Mean = 100 and Standard Deviation =
20. Such equivalent standard scores were obtained for each AGCT-1a raw score. These
converted scores became known as "Army Standard Scores." The Army Standard Score system
has been applied to practically all subsequent classification and screening instruments.
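
The scoring and norming steps described for AGCT-1a can be sketched as follows. This is an illustration only: the function names, the sample data, the cell weights, and the use of a simple linear rescaling to Mean = 100 and Standard Deviation = 20 are assumptions, since the report does not give the actual computation.

```python
import math


def corrected_raw_score(rights, wrongs, alternatives=4):
    """Rights minus wrongs/(k - 1), the chance correction for k-choice items.

    With four alternatives this is rights minus one-third wrongs.
    """
    return rights - wrongs / (alternatives - 1)


def weighted_standard_scores(raw_scores, cell_weights, mean=100.0, sd=20.0):
    """Linearly rescale raw scores so the weighted sample has the target mean and SD.

    cell_weights[i] is the (hypothetical) census weight of the stratified
    cell to which examinee i belongs.
    """
    total = sum(cell_weights)
    w_mean = sum(w * x for x, w in zip(raw_scores, cell_weights)) / total
    w_var = sum(w * (x - w_mean) ** 2 for x, w in zip(raw_scores, cell_weights)) / total
    w_sd = math.sqrt(w_var)
    return [mean + sd * (x - w_mean) / w_sd for x in raw_scores]


# Illustrative data only: made-up answer counts, raw scores, and weights.
print(corrected_raw_score(rights=90, wrongs=30))
print(weighted_standard_scores([40.0, 60.0, 80.0], [1.0, 2.0, 1.0]))
```

The weights stand in for the census proportions of the age-education-geographical cells; an examinee from an under-sampled cell counts for more when the mean and standard deviation are computed.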

To  meet  operational  requirements,  mental  test  scores  were  grouped  broadly  on  the 
basis  of  Army  Standard  Scores  into  five  "Army  Grades"  or  mental  groups.  Originally  they 
were  as  shown  in  Table  1. 


Table 1. Mental group classifications of Army Standard Scores.

Army Grade (Mental Group)    Army Standard Score Range*
I                            130 and higher
II                           110 - 129
III                          90 - 109
IV                           70 - 89
V                            69 and lower

* Percentages of the Army population falling in each mental group
varied from time to time with changes in norms. These percentages
for current norms are shown in the later section Operational
Application of AFQT-1 and AFQT-2.

These groups remained as defined above, except that in July 1942 the lower limit of
group IV was changed from standard score 70 to standard score 60. This was done so that
the distribution of scores would correspond better with the distributions anticipated from
operational use. Although the standard score ranges for groups IV and V varied, this
grading system has remained with the Army, and equivalence of subsequent selection and
classification test scores to the various grade level limits is an important aspect of their
standardization. As will be seen later, this grade system in 1950 became the basis for
controlling the distribution of armed forces input among the Army, Air Force, Navy, and
Marine Corps. The test AGCT-1a was introduced operationally in October 1940.
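
The grouping in Table 1, together with the July 1942 change in the lower limit of group IV, amounts to a simple lookup. The sketch below is illustrative; the function name and the parameter `group_iv_lower_limit` are hypothetical and are not taken from the report.

```python
def mental_group(standard_score, group_iv_lower_limit=70):
    """Map an Army Standard Score to an Army Grade (mental group) per Table 1.

    group_iv_lower_limit is 70 under the original definition and 60 under
    the July 1942 revision.
    """
    if standard_score >= 130:
        return "I"
    if standard_score >= 110:
        return "II"
    if standard_score >= 90:
        return "III"
    if standard_score >= group_iv_lower_limit:
        return "IV"
    return "V"


# A standard score of 65 falls in group V originally and in group IV after July 1942.
print(mental_group(65), mental_group(65, group_iv_lower_limit=60))
```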

AGCT-1b: This test had the same general composition as form 1a. It contained 50
arithmetic, 50 vocabulary, and 50 block counting items. The 50 block counting items were,
in fact, identical to those in form 1a. It was standardized on a population of 3,856 soldiers
drawn from 8 of the 9 Army Corps areas. The method used was line-of-regression equation
of 1b scores to standard scores on 1a. The test was introduced in April 1941.
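
A line-of-regression conversion of this kind can be sketched as below. The report does not state the direction of the regression or the fitting details, so the sketch assumes an ordinary least-squares line predicting the reference-form standard score from the new-form raw score; the function names and data are hypothetical.

```python
def fit_regression_line(raw_scores, reference_standard_scores):
    """Least-squares line predicting reference standard score from raw score."""
    n = len(raw_scores)
    mean_x = sum(raw_scores) / n
    mean_y = sum(reference_standard_scores) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(raw_scores, reference_standard_scores))
    sxx = sum((x - mean_x) ** 2 for x in raw_scores)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def to_standard_score(raw_score, slope, intercept):
    """Convert a new-form raw score to a rounded standard score equivalent."""
    return round(slope * raw_score + intercept)


# Made-up paired scores for four men who took both forms.
slope, intercept = fit_regression_line([10, 20, 30, 40], [70, 90, 110, 130])
print(slope, intercept, to_standard_score(25, slope, intercept))
```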

AGCT-1c and -1d: Each of these tests consisted of 140 items (47 vocabulary, 49 arithmetic,
and 44 block counting) arranged in spiral omnibus form. These items also had four
alternatives, and the tests were scored rights minus one-third wrongs, as in AGCT-1a and
-1b.

Forms 1c and 1d were standardized on a population of 1,782 soldiers from two Army
Corps. Scores on forms 1c and 1d were equated to Standard Scores on form 1a by a
combination of the equipercentile method and the line-of-regression method. They were
introduced operationally in October 1941.

AGCT-3a: This test departed from previous AGCT forms in that it was actually a battery
of four tests which could be scored in total, or separately in order to provide separate
measures of the abilities measured by AGCT. This was the beginning of the "Classification
Battery" idea. The four component tests of AGCT-3 were (1) Reading and Vocabulary,
(2) Arithmetic Reasoning, (3) Arithmetic Computation, and (4) Pattern Analysis. These tests
were each of the multiple choice, four alternative answer type, and were scored rights
minus one-third wrongs. The total score was the sum of raw scores on the four component
tests.

AGCT-3a was standardized on a population of 39,178 soldiers stratified by Service
Command, color, age, and education. The method of standardization used was equipercentile
equating of AGCT-3a raw scores to Standard Scores on AGCT-1a and -1b.

This  test  was  introduced  in  April  1945.  As  of  this  date,  it  is  still  used  by  the  Marine 
Corps  for  classification. 

AGCT-3b: This was an alternate form of AGCT-3a, having the same composition and
item types but using different items. AGCT-3b was standardized on a group of 1,000 soldiers
at Camp Atterbury selected to match proportionally the numbers of white and colored men
and the distribution of AGCT-3a scores to those of the Army as a whole. Standard Scores
for raw scores on AGCT-3b were obtained by equipercentile equating to Standard Scores on
AGCT-3a.

This test was introduced for alternate use with AGCT-3a shortly after VJ Day, August
1945.


ARMY  CLASSIFICATION  BATTERY  (ACB) 

The AGCT tests were predominantly of the verbal type, including items of the arithmetic
reasoning, vocabulary, reading comprehension, and spatial relations types. As early
as 1941, research and operating experience indicated the need for supplementary tests for
use in classification. Beginning in 1941, specific tests such as Mechanical Aptitude, Clerical
Speed, Radio Code Learning, and Automotive Information were introduced at various
times to supplement AGCT in classification. By the fall of 1947 ten such tests had been in
use for classification purposes; but interpretations of the meaning and appropriate use of the
test scores varied widely because classification officers differed in the amount of their technical
knowledge, and it was not possible at the time to make available sufficient data on validity
and interrelationships so that even technically-trained personnel could make optimum use of
the tests. These ten tests made up the Army Classification Battery. Work was begun on a
continuing program to study various combinations of these tests which were valid for groups
of Army MOS's. These combinations, predictive of performance for similar MOS groups,
were called "Aptitude Areas." By the spring of 1949, the ten classification tests were grouped
into ten "Aptitude Areas." At this time, the Army Classification Battery, making use of the
"Aptitude Area" system for classification at Reception Centers, was introduced officially for
classification of soldiers. The tests AGCT-3a and -3b were withdrawn. It should be noted,
however, that three of the subtests of 3a and 3b (Reading and Vocabulary, Arithmetic
Reasoning, and Pattern Analysis) were retained as separate tests in the Army Classification
Battery, and made up three of the ten classification tests in the Battery.


DEVELOPMENT  AND  USE  OF  SCREENING  TESTS 

The use of psychological tests during World War II for purposes of initial screening for
Service began soon after induction became effective. Wartime screening for mental abilities
passed through four phases, each characterized by psychological testing procedures.


The first phase included the wartime period prior to August 1942. Regulations at this
time excluded from military service all men who did not have the capacity for "reading and
writing the English language as commonly prescribed for the fourth grade in grammar
school." It was further prescribed that men who had not completed the fourth grade would be
tested at induction stations to determine whether they possessed this capacity. Toward the
latter part of this period a test called the "Minimum Literacy Test" was developed for this
purpose. There were 12 forms of this test with 12 simple questions each. Each form had a
passing score of 9 correct answers, which was considered equivalent to fourth grade reading
and writing ability, and set to differentiate mental group IV from mental group V. Although
6 of these 12 test forms were placed in the hands of the National Selective Service
Headquarters for use by local boards in preliminary screening, local boards did not generally use
them. Responsibility for screening with the Minimum Literacy Test was placed with the
induction station.


The second phase included the period August 1942 to June 1943. During this period
induction of men who could not meet the above literacy standards was permitted provided they
possessed sufficient intelligence to absorb military training rapidly. The induction of such
men was limited to fixed quotas in terms of percentages of the total number of men inducted
at each station each day. These quotas varied from 10% at the outset to 5% in February 1943.
This regulation introduced to the Services men designated as illiterate, and was the beginning
of screening on the basis of mental ability in addition to literacy. These new regulations
required a more comprehensive system of screening at induction stations. On the basis of
research, a series of multiple hurdles was applied to men who did not qualify by virtue of
fourth grade education. Those who passed any one hurdle were considered mentally acceptable.
These hurdles consisted of: (1) The Army Information Sheet, a revision of the Minimum
Literacy Test which was used to determine acceptable literacy; (2) The Visual Classification
Test, a non-language group test of mental ability, administered to those who did not pass on
literacy; and (3) A battery of two individual mental tests, the Concrete Directions Test and the
Block Counting Test, given to those who did not pass the group mental test. Those who did
not qualify in any of these tests were rejected.
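
The multiple-hurdle rule of the second phase can be sketched as a short routine. The passing scores below are hypothetical placeholders, since the report does not give them, and the function and variable names are likewise assumptions made for illustration.

```python
# Hurdles in order of administration; passing scores are made-up placeholders.
SECOND_PHASE_HURDLES = [
    ("Army Information Sheet", 9),
    ("Visual Classification Test", 20),
    ("Individual tests", 15),
]


def first_hurdle_passed(scores, hurdles=SECOND_PHASE_HURDLES):
    """Return the name of the first hurdle passed, or None if the man is rejected.

    scores maps hurdle name to the score obtained. A man takes the next
    hurdle only after failing the previous one, so later scores may be absent.
    """
    for name, passing_score in hurdles:
        score = scores.get(name)
        if score is not None and score >= passing_score:
            return name
    return None


# A man who fails the literacy hurdle but passes the group mental test is accepted.
print(first_hurdle_passed({"Army Information Sheet": 5,
                           "Visual Classification Test": 24}))
```

Passing any one hurdle is sufficient, which is why the routine returns at the first success rather than requiring all scores.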

During the third phase, June 1943 to May 1944, no limits were placed on the number of
"illiterates" to be inducted because of the need for rapid expansion. This phase also differed
from the previous two in that special training units were set up to give these men literacy
training before they were forwarded to regular training centers. For screening at this time,
a new test called the Qualification Test replaced the Army Information Sheet as the initial
literacy screening instrument. It consisted of a series of items (number series, space
orientation, arithmetic, and reading and vocabulary) arranged in cycles of increasing difficulty.
The Visual Classification Test and the individually administered Concrete Directions and
Block Counting tests remained in the battery as before. A passing score on any one test
hurdle qualified a man mentally.

In the fourth phase, June 1944 to the end of World War II, a new series of mental tests
which had been subjected to considerable test construction and validation study was introduced.
Validation studies of these tests were made with full consideration of the need for
distinguishing between literacy, on the one hand, and over-all performance as a soldier, on the other.
This is a continuing problem in the validation of mental tests since evidence is available which
indicates that these two aspects are not highly correlated with each other.

The new series of tests was incorporated in a mental screening procedure which
included the following instruments: (1) The Qualification Test, which was used as in the
previous phase, except that the passing score was raised in accordance with research findings
and an alternate form was developed; (2) The Group Target Test, which replaced the Visual
Classification Test; (3) The Individual Test; and (4) The Non-Language Individual Examination.
At this time all men who failed the Qualification Test, but were accepted on passing a
subsequent hurdle, were placed in Special Training Units before being assigned to Basic Training.
Those who could not learn (as determined by the Specialized Training Unit) the required
academic and military subjects during a [...] of 13 weeks in Special Training Units were
discharged and returned to civilian life.

Screening tests during World War II were used to select men on the basis of very low
mental standards, i.e., those who did not possess sufficient literacy or mental ability to
absorb the most elementary training. However, this experience emphasized the value of
screening on the basis of mental ability, a practice which continued after the war. Higher
mental standards could be applied.


CLASSIFICATION  TESTING  CONCEPTS  OF  SIGNIFICANCE  TO  SCREENING  TESTING 

Several concepts utilized in the development of classification testing in the Army had a
significant bearing on research in the development of screening tests under this program.
Those concepts which are most important to this program are:

The "Army Standard Score" System. The system of conversion of raw scores on tests to
Army Standard Scores, which began with AGCT-1a, provides an established frame of
reference for interpreting test scores. This concept has remained with the Army to the present.
The "Army Standard Score" distribution was originally defined as having a Mean of 100 and
a Standard Deviation of 20. Though the Army Standard Score is still the basic means of
converting raw scores, it no longer has its original definition in terms of mean and standard
deviation and the percentages expected from probability tables no longer apply. Through
successive use of tie-back standardization procedures (to be described below) on subsequent
tests, the Army Standard Score on AGCT and similar tests has come to mean the score on
those tests which is equivalent to that standard score on the original AGCT-1a. Thus, for
example, a score of 85 on AFQT means that an equivalent score of 85 would be obtained on
AGCT-1a, regardless of the percentage of men making that score at any given time.

(instead of between groups IV and V), as was presumed to be required for the higher standards
of  a peacetime  Army.  As  will  be  discussed  later,  this  system  of  mental  groups  also  was  used 
as  a basis  for  allocating  personnel  among  the  Army,  Navy,  and  Air  Force,  beginning  in  1950. 

Tie-Back Standardization Methods. In a previous discussion, it was pointed out that
AGCT tests developed subsequent to AGCT-1a were standardized by computing raw scores on
these tests equivalent to the standard scores originally developed for AGCT-1a on a group
selected to represent the total US male population between the ages of 20-29 years in 1940.
The standard scores then represented norms for the AGCT tests. This tie-back type
standardization was accomplished by: (1) Choosing a sample population within the Army;
(2) Administering the test to be standardized and a previous form of AGCT test (reference test) upon
which standard scores had been established; and (3) Computing standard score equivalents in
the reference test for raw scores on the test being standardized by means of equipercentile
equivalents or line-of-regression equivalents. One advantage of this method is that each
standardization does not require a precisely representative sample of the total population to
which the test norms apply. It does require, however, that each new standardization sample
contain a representation of cases at all score levels throughout the total range, and that no
extraneous biasing variables be introduced in selecting the sample. Standardization by
tie-back methods was used entirely in the development of screening tests under this program.
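
Step (3) of the tie-back procedure, in its equipercentile form, can be sketched as follows. The report does not describe the computational details, so the mid-rank percentile definition, the linear interpolation, and all function names here are assumptions; the data are made up.

```python
from bisect import bisect_left, bisect_right


def mid_rank(sorted_scores, value):
    """Number of scores below value plus half of the scores equal to it."""
    below = bisect_left(sorted_scores, value)
    equal = bisect_right(sorted_scores, value) - below
    return below + 0.5 * equal


def equipercentile_table(new_raw_scores, reference_standard_scores):
    """Map each observed raw score on the new test to the reference-test
    standard score holding the same relative position in the sample."""
    new_sorted = sorted(new_raw_scores)
    ref_sorted = sorted(reference_standard_scores)
    n_new, n_ref = len(new_sorted), len(ref_sorted)
    table = {}
    for raw in sorted(set(new_sorted)):
        # Fractional index into the sorted reference scores at the same mid-rank.
        pos = mid_rank(new_sorted, raw) * n_ref / n_new - 0.5
        pos = min(max(pos, 0.0), n_ref - 1.0)
        lo = int(pos)
        hi = min(lo + 1, n_ref - 1)
        frac = pos - lo
        table[raw] = ref_sorted[lo] + frac * (ref_sorted[hi] - ref_sorted[lo])
    return table


# Four men took both tests; two are tied on the new test.
print(equipercentile_table([10, 10, 20, 30], [70, 80, 90, 100]))
```

Because only relative standing within the sample is used, the sample need not mirror the norm population, but it must contain cases across the whole score range, as the text notes.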


Interpretation of test scores by reference to Army Standard Scores has become very
familiar to Army personnel and Army Standard Score conversions for tests of the AGCT type
are almost a necessity. In addition, certain standard scores which have been used at various
times as reference points and cutting scores have become important bench marks by which
operating Army personnel evaluate mental ability.

Mental Groups. The five mental groups, I, II, III, IV, V (see Table 1), used in
classifying men by mental ability have also become an established concept in the Army. These
mental groups were used as a basis for allocating men within the Army organization to
various units. Mental groups have been defined in equivalent terms for all AGCT tests. The
concept of mental groupings was taken over later for screening tests, and some of the earlier

ORIGIN OF THE NEED FOR SCREENING TESTS

At the close of World War II in 1945, involuntary inductions were stopped and voluntary
recruitment began again for the Armed Forces. The screening tests used for induction of
marginally mental and illiterate groups were dropped from use.

One test, which was used specifically as a screening instrument for limited service
personnel during World War II, showed promise as a screening device for recruits in peacetime.
This test was called R-1 (1), introduced in October 1942 to screen inductees who were limited
physically but who could be used in restricted assignments. It was a short test made of 50
items from AGCT-1a (17 vocabulary, 18 arithmetic, and 15 block counting items). Items for
the test had been selected so as to give maximum discrimination between men in mental grade
III and those in mental grade IV. The test was standardized and calibrated against AGCT-1a
to yield raw score equivalents to AGCT-1a standard scores. A relatively high cutting score of
standard score 90 on R-1 was established as a requirement for induction of such personnel to
compensate for the limited physical [...].

In April 1946 the test R-1 was introduced officially as a means of determining mental
qualification of applicants for enlistment in the Army, and a raw score of 15 (standard score

2. Development of Armed Forces Qualification Test (AFQT). The objective of this
phase was to develop, through joint Army, Navy, and Air Force effort, a screening
instrument which could be used by all three Services at their main examining stations for purposes
of determining acceptance or rejection of enlistees or inductees.

3. Follow-Up of the Standardization of AFQT. The objective of this phase was to
recheck the standardization of AFQT and to determine the effect of operational administration
of the test upon norms and standards established on the basis of its original standardization.

Further detailed objectives of these three phases will be discussed in connection with the
following report of research accomplished.


DEVELOPMENT OF RECRUITMENT TESTS: R-2, R-3, R-4, R-5, AND R-6

Each of the recruiting tests, R-2, R-3, R-4, R-5, and R-6, was first developed for
use as a screening device at main examining stations. All except R-5 and R-6 were later
transferred  to  use  as  prescreening  instruments  at  local  recruiting  stations  as  forms  were 
developed  for  use  at  main  stations.  Appendix  A lists  the  chronology  of  introduction  and  use 
of  these  tests  as  specified  in  Army  regulations. 


DEVELOPMENT OF R-2

R-2 was originally intended to be an alternate form to R-1 for screening limited service
inductees. When induction became the sole source of procurement, work on R-2 was
discontinued. With the return of the Army to a peacetime basis in 1946, induction was discontinued
and voluntary enlistment became the sole means of entry into the Army. New screening tests
were needed, and among these, the development of R-2 was resumed for use as an alternate
to R-1 administered at local recruiting stations.

In constructing R-2, 35 items were selected (2) from AGCT-1b (17 vocabulary and
18 arithmetic) which discriminated between men of mental grades III and IV. The fifteen block
counting items used in R-1 were added to make an omnibus 50-item test, essentially a short
form of AGCT-1b. The fifteen block counting items were retained from R-1 since AGCT-1a,
of which R-1 was a short form, contained the same block counting items as AGCT-1b. The
vocabulary and arithmetic items for R-2 were selected on the basis of the significance of
differences in their p-values between groups of 500 mental grade III and 500 mental grade IV men
from various reception centers, and on the basis of matched difficulty distribution with similar-
content items of the test R-1.
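
The item-selection criterion described above can be illustrated with a two-proportion test. Here "p-value" means item difficulty, that is, the proportion of a group answering the item correctly. The report does not say which significance test was used, so the pooled z statistic below, the function name, and the counts are assumptions for illustration.

```python
import math


def item_discrimination_z(pass_high, n_high, pass_low, n_low):
    """Pooled two-proportion z statistic for the difference in item p-values
    (proportion passing) between a higher and a lower mental grade group."""
    p_high = pass_high / n_high
    p_low = pass_low / n_low
    pooled = (pass_high + pass_low) / (n_high + n_low)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_high + 1.0 / n_low))
    if se == 0.0:
        # Everyone passed or everyone failed: the item cannot discriminate.
        return 0.0
    return (p_high - p_low) / se


# Made-up counts: 350 of 500 grade III men and 150 of 500 grade IV men pass the item.
print(item_discrimination_z(350, 500, 150, 500))
```

Items with large positive values separate the two grades most sharply near the III/IV boundary, which is the region where the cutting score for R-2 was to operate.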

R-2 was standardized first in 1942 (3) in order to determine "critical" score equivalents
to standard scores of 90 and 100 on the AGCT-1c scale. The population used was a group of
375 men who came to the reception center at Camp Lee, Virginia, on 25 and 26 March 1942.
The men represented all five mental grades on AGCT-1c. Both AGCT-1c and R-2 were given
to the group using a counterbalanced order of administration. Equivalent scores on R-2 for
standard scores of 90 and 100 on AGCT-1c were determined by the equipercentile method.
The reliability of R-2 was estimated at .94 and its correlation with AGCT-1c at .83. The
correlation between R-2 and R-1 was .87.

Later in 1946, in connection with the standardization of R-3 and R-4, the test R-2 was
standardized on a group of 700 enlistees at Camp Atterbury Reception Center (4) chosen to
represent proportionally the distribution of the general Army population among the five mental
grades on AGCT-3a. Equivalents to standard scores on AGCT-3a for each raw score on R-2
were obtained by the equipercentile method.

DEVELOPMENT OF R-3 AND R-4


Early in 1946, work was begun on a test for use in Army enlistment which could be
administered individually or in groups with a minimum of verbal instruction, could be
completed in less than 30 minutes, and would provide scores as nearly equivalent to AGCT-3a
standard scores as possible within the lower test score ranges.

While R-2 had not been introduced operationally at this time, it seemed probable that
items selected from the residue available after constructing AGCT-3a and AGCT-3b would
correlate more highly with AGCT-3a than would those selected from AGCT-1b for R-2.
Furthermore, R-2 was still in need of further research. It was decided to study R-2 further,
along with the development of R-3 and R-4, and to compare R-2 with the newly developed tests.

Preparation of Experimental Tests R-3x and R-4x. Two experimental tests, R-3x and R-4x,
were constructed, each consisting of 15 pattern analysis, 10 arithmetic reasoning, and
10 reading and vocabulary items from the residue of those employed in constructing AGCT-3a,
b, c, and d. Items were selected on the basis of high internal consistency and appropriate
difficulty distribution. Items were paired for the two forms to give equivalence in terms of
content, difficulty, and internal consistency. Directions for spatial items were expanded and
illustrated more thoroughly in order to make the spatial items more understandable to lower
level recruits.

Administration of Experimental Tests. Four populations were tested for this study from
enlistees and inductees entering Camp Atterbury Reception Center, Massachusetts, between
15 April and 31 May 1946. Tests were administered to these populations as follows:

Population A - R-2, AGCT-3a
Population B - R-3x, AGCT-3a
Population C - R-4x, AGCT-3a
Population D - R-3x, R-4x, AGCT-3a

After testing larger groups, each population was chosen so as to duplicate proportionally
the distribution of the general Army population in 1944 on AGCT grade and color.
Examinees were asked to indicate the item on R-3x and R-4x reached at the end of 15 and
20 minutes, so that these time limits could be compared with the standard 25 minutes for
those tests and the 15 minutes for R-2. The scoring formula was rights minus one-third
wrongs for all tests.
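The scoring formula just stated can be written out explicitly. The following is a minimal Python sketch; the function name and the example counts are illustrative and are not taken from the report, and omitted items are assumed to count neither as rights nor as wrongs. A divisor of 3 would be the customary correction for guessing on four-choice items, although the number of choices per item is not stated here.

```python
def formula_score(rights, wrongs, penalty_divisor=3):
    """Formula score: rights minus (wrongs / penalty_divisor).

    penalty_divisor=3 gives "rights minus one-third wrongs".
    Omitted items are assumed not to enter the score at all.
    """
    return rights - wrongs / penalty_divisor


# Hypothetical examinee on a 35-item form: 24 right, 6 wrong, 5 omitted.
print(formula_score(24, 6))  # 22.0
```

Setting `penalty_divisor` very large approaches the "rights only" scoring that the report later describes for AFQT.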

Results of Administration of Experimental Tests. Preliminary results of this study showed
superiority of R-3x and R-4x over R-2 in terms of correlation with AGCT-3a at various points
of cut, and of the 25-minute time limit for R-3x and R-4x over the 15- and 20-minute limits.

Appendix B shows N's, means, standard deviations, and correlations between the
experimental tests and AGCT-3a. Both R-3x and R-4x correlated approximately .85 with
AGCT-3a as compared to .79 between R-2 and AGCT-3a. The correlation between R-3x
and R-4x was .80. These findings suggest that R-3x, R-4x, and AGCT-3a are homogeneous
in content, and that the shorter tests R-3x and R-4x are somewhat less reliable as measures
of this content.

Conclusions. 1. The 25-minute time limit for R-3x and R-4x gave better prediction of
AGCT-3a within the desired range of scores than the 15- and 20-minute limits.

2. The R-3x and R-4x tests were more accurate predictors of AGCT-3a within the
desired range than R-2.

3. The reliability of R-3x and R-4x was adequate.

4. The mean item difficulty (mean p-value) of R-3x and R-4x was somewhat low for
optimal predictive efficiency within the range of primary interest.

Recommendations. It was recommended that R-3x and R-4x be introduced without change
in item content and that revisions be initiated to decrease the mean item difficulty (i.e., to
increase the mean p-values).

All three tests, R-2, R-3, and R-4, were standardized by determining raw score
equivalents to AGCT-3a standard scores using the equipercentile method. Appendix C shows
the standard score conversion tables derived for these three tests from the aforementioned
Camp Atterbury group.
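The equipercentile method named above matches a raw score on the new test to the reference-test score that has the same percentile rank in the same population. The report does not give its computational procedure, so the following Python sketch is only an illustration under stated assumptions: the function names and the toy score lists are hypothetical, percentile rank is taken as percent below plus half the percent at the score, and the nearest reference score is used where operational work would normally interpolate on smoothed curves.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(scores_sorted, x):
    """Percent of cases below x plus half the percent exactly at x."""
    below = bisect_left(scores_sorted, x)
    at = bisect_right(scores_sorted, x) - below
    return 100.0 * (below + 0.5 * at) / len(scores_sorted)


def equipercentile_table(new_scores, ref_scores):
    """Map each observed raw score on the new test to the reference-test
    score whose percentile rank is closest (no interpolation)."""
    new_sorted = sorted(new_scores)
    ref_sorted = sorted(ref_scores)
    ref_values = sorted(set(ref_sorted))
    ref_ranks = [percentile_rank(ref_sorted, v) for v in ref_values]
    table = {}
    for raw in sorted(set(new_sorted)):
        target = percentile_rank(new_sorted, raw)
        best = min(range(len(ref_values)),
                   key=lambda i: abs(ref_ranks[i] - target))
        table[raw] = ref_values[best]
    return table


# Toy data: five examinees who took both tests (values are invented).
raw_scores = [1, 2, 3, 4, 5]
standard_scores = [60, 80, 100, 120, 140]
print(equipercentile_table(raw_scores, standard_scores))
```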

In August 1946, R-3 and R-4 were introduced as screening tests at Central Examining
Stations, where a standard score of 70 or over qualified a man for enlistment. The R-2 test
was transferred for use together with R-1 at local recruiting stations for pre-screening
applicants before transmitting them to Central Examining Stations.


DEVELOPMENT OF R-5 AND R-6

Early in 1948, the responsibility of Central Examining Stations for final examining
of recruits was transferred to the Recruiting Service at Main Recruiting Stations. At this
time it was decided that R-3 and R-4 would be used for pre-screening at local recruiting
stations, replacing R-1 and R-2.

Another test was needed for final screening at Main Recruiting Stations. The AGCT-1c
and AGCT-1d were republished as R-5 and R-6 and authorized for this purpose. While
R-5 and R-6 carried booklet covers with the title, "Classification Test R-5" (or R-6), there
were no changes in the content of the test. Therefore the norms previously developed in
standardization of AGCT-1c and AGCT-1d were used for R-5 and R-6 without further
standardization. Appendix D shows the norm table for converting R-5 and R-6 raw scores to Army
Standard Scores.

Later in 1948 these same R-5 and R-6 tests were republished with new covers and
entitled General Classification Test (GCT) -5 and -6.

DEVELOPMENT OF THE ARMED FORCES QUALIFICATION TEST (AFQT), FORMS 1 AND 2


TRANSITION FROM R-TESTS TO AFQT

The tests R-5 and R-6 (or GCT-5 and GCT-6) continued in use at Main Recruiting
Stations until 1 January 1950, when they were replaced by AFQT-1 and AFQT-2. The
development of screening tests R-1 through R-5 and R-6 had been a single Service (Army) endeavor.
With the passage of the Selective Service Act of 1948, which provided that the Services would
not reject anyone for mental reasons who had Army standard scores of 70 or better, the need
arose for greater uniformity in mental screening procedures among the Services.

At this time three different tests were in use among the various branches of the Armed
Forces to evaluate generally similar characteristics of inductees or applicants for enlistment.
They were:


1/ Dr. H. Brandt was largely responsible for directing the development of AFQT-1 and -2,
and for the first draft of this section of the report.

Army and Air Force: General Classification Test, Forms 5 and 6.

Navy:  Navy  Applicant  Qualification  Test,  Form  3. 

Marine Corps: Army General Classification Tests, Forms 1c and 1d.

In  addition,  three  different  systems  of  converting  scores  and  reporting  results  of  tests 
were  in  use  by  the  various  Armed  Forces,  as  follows: 

Army and Marine Corps: Standard scores based on a mean of 100 and standard deviation
of 20.

Navy: Standard scores based on a mean of 50 and standard deviation of 10.

Air  Force:  Standard  scores  based  on  a mean  of  5 and  a standard  deviation  of  2. 
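To make the relation among the three reporting systems concrete, the following Python sketch converts a score from one system to another by way of its standard (z) value. This is a purely linear rescaling offered as an illustration; it is not the equipercentile procedure the report uses for its actual conversion tables. The dictionary keys and function name are hypothetical, and the Navy mean is taken as 50, which agrees with the Navy score intervals shown in Table 4.

```python
# (mean, standard deviation) of each Service's standard score system.
# The Navy mean of 50 is an assumption consistent with Table 4.
SCALES = {
    "army": (100.0, 20.0),       # also used by the Marine Corps
    "navy": (50.0, 10.0),
    "air_force": (5.0, 2.0),
}


def convert(score, source, target):
    """Linearly re-express a standard score in another Service's system."""
    src_mean, src_sd = SCALES[source]
    tgt_mean, tgt_sd = SCALES[target]
    z = (score - src_mean) / src_sd
    return tgt_mean + z * tgt_sd


# The Army qualifying score of 70 lies 1.5 standard deviations below the mean.
print(convert(70, "army", "navy"))       # 35.0
print(convert(70, "army", "air_force"))  # 2.0
```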

Anticipating a request for uniform screening instruments, an unofficial working group
of technicians from the three Services began planning a joint screening test in 1948. On
26 November 1948, the Advisory Committee on Selective Service, Office of the Secretary of
Defense, recommended that this working group be given official sanction as a subcommittee
to study uniform screening tests and scoring systems for inductees and enlistees in the Armed
Forces. The subcommittee 2/ was officially so designated on 27 January 1949. It was by joint
research that the Army, Navy, and Air Force, through this subcommittee, developed AFQT-1
and AFQT-2. The Department of the Army was given responsibility for direction of the
project and for analysis and presentation of data.


SPECIFIC  OBJECTIVES 

The problems involved in the development of this common screening instrument were
manifold. However, it was most fundamental to determine what the test should measure. It
was decided that the test should represent a global measure of mental ability, containing
essentially those item types which were most common to existing screening tests in all
Services. Therefore, it was agreed that, as previous research experience indicated, the test
would include items of the vocabulary, arithmetic reasoning, and spatial relations types.

Other  problems  of  a more  specific  character  soon  became  apparent. 

1. The new instrument should have maximal sensitivity in the 60 to 90 Army Standard
Score range as well as adequate distribution of scores throughout the total range so that
division into five grade groups for allocation of personnel to the various Services could be
accomplished.


2/ Subcommittee military personnel were as follows: Navy - Cmdr. O. E. McCombs,
Chairman; Army - Lt. Col. C. G. Dunn and Lt. Col. D. B. Routh; Air Force - Maj. Albert L.
Klinge; Marine Corps - Lt. Col. B. D. Godbold; and Coast Guard - Cmdr. E. T. Callahan.
Psychological specialists appointed by this subcommittee as project personnel included:

Dr. J. E. Uhlaner, D/A, TAGO, Program Coordinator; Dr. H. Brandt, D/A, TAGO,
Project Director; Dr. E. G. Brundage, D/N, BuPers; Dr. J. H. Criswell, D/N, BuPers;
and Dr. Frank A. Geldard, D/AF, HRD. Other technical specialists who were utilized in
the development of AFQT-1 and -2 included: Dr. Glenn Finch, D/AF, HRD; Dr. Donald
Baier, D/A, TAGO; Dr. C. I. Mosier, D/A, TAGO. The following sampling specialists
advised on the design of the standardization plan: Dr. W. E. Deming, Div. Stat.
Standards, Bureau of Budget; Dr. B. Tepping, Sampling Research Section, Bureau of Census;
and Dr. D. Chapman, RDB.

Collection of Item Analysis Data. A total of 554 items of the three foregoing types was
selected and item indices (difficulty, validity, internal consistency, and independence) were
determined after experimental administration of the tests to approximately 7,000 men
entering the Army, Air Force, and Navy.

For the purpose of item analysis, the items for each type were divided equally into two
separate booklets and arranged in order of guessed difficulty. These experimental booklets
were labeled GCT-7x and GCT-8x. The tests were administered during the month of October
1948 to Army, Air Force, and Navy personnel at recruiting stations and training divisions,
selected to give a good spread of the population on the reference tests. The experimental
tests were administered following the official recruiting tests which were used to establish
acceptance or rejection. The reference tests for the Army and Air Force populations were
the recruiting tests GCT-5 and GCT-6 3/ or AGCT-3a. For the Navy population the reference
test was the recruiting test, Applicant Qualification Test (AQT-3).

A total of 7,114 cases was tested on both forms of GCT-7x and GCT-8x, of which 389
were discarded because of lack of complete data. The breakdown by form number, by
recruiting stations, and training divisions is shown in Appendix E.

Further adjustments were made to obtain the same racial proportions reported for the
entire period of World War II. After these adjustments, which reduced the number of Negroes
to 10% and other non-whites to 1% of the total sample, 5,742 cases remained for the various
phases of the item analysis.

For the purpose of item-difficulty analysis, all available cases (5,742) were used.
Three populations were set up, one for each of the reference tests:

1. For the first (Navy) population, the cases obtained in the recruiting stations and
training divisions were combined and AQT-3 was used as the reference test.

2. The second population was a combined Army-Air Force group from recruiting
stations and training divisions, and the recruiting test, R-5 and R-6, was used as one part of the
criterion.

3. The third population was drawn only from the training divisions of the Army and
Air Force, and AGCT-3a was used as one part of the criterion.

These populations were normalized before item validities (biserial correlation
coefficients) were obtained. For the two Army-Air Force populations a Mean of 100 and a
Standard Deviation of 20 were used; for the Navy population the Mean was 50 and the Standard
Deviation was 10.

For the purpose of obtaining item validities, three populations were set up based on
each of the three reference tests. These populations were normalized before the biserial
coefficients were obtained.
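For readers unfamiliar with the biserial coefficient used as the item validity index, the following Python sketch shows the textbook computation for one item against a reference-test score. The report does not print its computing formula, so this is an assumption about the standard definition rather than a record of the procedure actually followed; the function name and the sample data are hypothetical.

```python
from statistics import NormalDist, mean, pstdev


def biserial(item_correct, scores):
    """Textbook biserial correlation between a pass/fail item and a
    continuous criterion score assumed to be normally distributed.

    r_bis = (M_pass - M_fail) / s * (p * q / y)
    where p is the proportion passing, q = 1 - p, s is the standard
    deviation of all scores, and y is the normal ordinate at the point
    dividing the curve into areas p and q.
    """
    passed = [s for ok, s in zip(item_correct, scores) if ok]
    failed = [s for ok, s in zip(item_correct, scores) if not ok]
    if not passed or not failed:
        raise ValueError("item must be passed by some examinees and failed by others")
    p = len(passed) / len(scores)
    q = 1.0 - p
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    return (mean(passed) - mean(failed)) / pstdev(scores) * (p * q / y)


# Invented data: six examinees, reference standard scores and item results.
ref = [70, 80, 90, 100, 110, 120]
item = [False, False, True, False, True, True]
print(round(biserial(item, ref), 2))
```

Because the coefficient assumes a normal criterion, it can exceed 1.0 on small or sharply split samples, which is one reason the populations described above were normalized first.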

After  the  populations  were  normalized,  their  sizes  were  as  shown  in  Table  2. 


3/ Since GCT-5 and GCT-6 were the previous R-5 and R-6 tests, the latter designation will
be used below for convenience in referring to the reference tests for the Army and Air
Force populations.

Table 2. Populations used in item validity analysis of GCT-7x and GCT-8x.


Group                          GCT-7x    GCT-8x

AQT-3 (Navy)                     264       300
R-5, R-6 (Army-Air Force)        500       300
AGCT-3a (Army-Air Force)         400       558

For the internal consistency and independence analyses, it was not necessary to keep
the populations distinct as to Service since scores for all testees on all parts of the tests
were available. The three normalized populations used in the validity analysis were
combined and split into two equally distributed halves (sample S1 and sample S2) shown in Table 3.


Table 3. Populations used in internal consistency and independence
analyses of GCT-7x and GCT-8x.


Group                                        GCT-7x    GCT-8x

S1 - Split half of combined AQT-3,
     R-5, R-6, and AGCT-3a                     582       579
S2 - Other half of combined AQT-3,
     R-5, R-6, and AGCT-3a                     582       579

Sample S1 was used to obtain the biserial coefficients of each item with the total score
of each of the three subtests. Sample S2 was used to obtain a second biserial coefficient of
the item with the total score of the test of which it was a part.


Results of Item Analysis

1. Item difficulty analysis.

A new type of item difficulty index was developed in this study. The item difficulty
level assigned was the lowest reference test score interval at which an item was answered
correctly by at least 50% of the group in the interval. This was done for each of the two
reference scores. The criterion scores were grouped into nine intervals and coded as shown
in Table 4.

For each item in the experimental tests GCT-7x and GCT-8x, three difficulty indices
were secured based on AQT-3, R-5 or R-6, and AGCT-3a scores.

Table 4. Equivalence of difficulty levels to standard score intervals on Navy
and Army-Air Force tests.


Difficulty                              Army-Air Force
Level          Navy (AQT-3)             (R-5, R-6, AGCT-3a)

  1            29 and below             59 and below
  2            30-34                    60-69
  3            35-39                    70-79
  4            40-44                    80-89
  5            45-49                    90-99
  6            50-54                    100-109
  7            55-59                    110-119
  8            60-64                    120-129
  9            65 and above             130 and above
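The difficulty index defined above (the lowest reference-score interval in which at least 50% of the group answered the item correctly) can be sketched in Python as follows. The interval boundaries follow the Army-Air Force column of Table 4; the function names and the sample data are hypothetical, and intervals containing no cases are simply skipped, which is an assumption the report does not address.

```python
from bisect import bisect_right

# Lower bounds of difficulty levels 2 through 9 on the Army-Air Force scale;
# any score below 60 falls in level 1.
ARMY_CUTS = [60, 70, 80, 90, 100, 110, 120, 130]


def score_level(score, cuts=ARMY_CUTS):
    """Return the difficulty-level interval (1-9) containing a reference score."""
    return bisect_right(cuts, score) + 1


def item_difficulty_level(ref_scores, answered_correctly, cuts=ARMY_CUTS):
    """Lowest interval in which at least half the examinees passed the item.

    Returns None if no interval reaches 50% passing.
    """
    totals = [0] * (len(cuts) + 2)
    rights = [0] * (len(cuts) + 2)
    for score, ok in zip(ref_scores, answered_correctly):
        level = score_level(score, cuts)
        totals[level] += 1
        if ok:
            rights[level] += 1
    for level in range(1, len(cuts) + 2):
        if totals[level] and rights[level] / totals[level] >= 0.5:
            return level
    return None


# Invented data: reference scores and pass/fail results for one item.
scores = [55, 65, 65, 75, 75, 85]
passed = [False, False, False, True, False, True]
print(item_difficulty_level(scores, passed))  # 3
```

Passing a different list of cut points, for example the Navy intervals, gives the corresponding index on that scale.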

2. Item validities.

The scores on the 112 vocabulary items of GCT-7x and GCT-8x were correlated
with the items on the reference tests AQT-3, R-5, and R-6, and AGCT-3a. It can be seen
from Table 5 that each of the reference tests yielded approximately the same average item
validity for both experimental tests.


Table 5. Average item validities of the vocabulary items of GCT-7x and GCT-8x.


                                  GCT-7x                         GCT-8x
                         AQT-3    R-5    AGCT-3a        AQT-3    R-6           AGCT-3a

Mean (item biserials)     .44     .45      .43           .45     [illegible]     .45
Standard Deviation        .18     .15      .13           .18     [illegible]     .12
Number of items           112     112      112           112     112             112

For the arithmetic reasoning items the average item validity (see Table 6) for the
AQT-3 (Navy) criterion is slightly lower than for the other two on both experimental tests.
This is understandable since the AQT-3 is more highly verbal than either the R-5, R-6, or
AGCT-3a.

                          R-5     AGCT-3a     R-6     AGCT-3a

Mean (item biserials)     .20     .3[?]       .35     .31
Standard Deviation        .13     .13         .15     .12
Number of items            91      91          91      91


3. Internal consistency.

The mean item biserials for the original items were computed for each of the three
types of content and for each sample (S1 and S2). In each case the particular total test score
was the criterion. The values shown in Table 8 were obtained.

Table 8. Internal consistency coefficients for the items of GCT-7x and GCT-8x.


                             Vocabulary         Arithmetic Reasoning     Spatial Relations
                           7x         8x           7x         8x           7x         8x
                         S1   S2    S1   S2      S1   S2    S1   S2      S1   S2    S1   S2

Mean (item biserials)   .55  .64   .84  .61     .66  .62   .69  .58     .49  .49   .43  .48
Standard Deviation      .17  .25   .25  .24     .14  .11   .16  .20     .22  .15   .13  .13
Number of items         112  112   112  112      75   75    75   75      91   91    91   91

It  can  be  seen  that  the  average  correlations  are  quite  consistent  for  the  two  samples 
and  are  about  the  same  for  both  experimental  tests. 

4. Independence.

To arrive at an index of independence, performance on an item of one type was
correlated against the total score achieved on each of the other two types of item. For
example, vocabulary performance was correlated against total score on arithmetic reasoning and
spatial relations. The mean item biserials are given in Table 9.


Table 9. Independence coefficients for the items of GCT-7x and GCT-8x.


                              Vocabulary Items               Arithmetic Reasoning Items
                         Arithmetic       Spatial            Vocabulary       Spatial
                         Reasoning        Relation           Score            Relation
                         Score            Score                               Score
                         7x    8x         7x    8x           7x    8x         7x    8x

Mean (item biserials)    .53   .56        .43   .27          .38   .47        .41   .35
Standard Deviation       .21   .21        .21   .11          .11   .12        [illegible]
Number of items          112   112        112   112           75    75         75    75


                           Spatial Relations Items
                         Vocabulary       Arithmetic
                         Score            Reasoning
                                          Score
                         7x    8x         7x    8x

Mean (item biserials)    .22   .31        .34   .09
Standard Deviation       .10   .10        .15   .13
Number of items           91    91         91    91

Examination of these coefficients reveals that performance on the spatial relations
items is least related to performance on both the vocabulary and arithmetic reasoning items.
The relatively high relationship between vocabulary and arithmetic reasoning is probably the
result of the verbal element still present in the arithmetic items.

Selection of Items for AFQT, Forms 1 and 2. After the item analysis of GCT-7x and GCT-8x
the task was to develop two forms of the screening instrument which came to be known as
the Armed Forces Qualification Test (AFQT-1 and AFQT-2).

It was decided that the test should be short enough so that when fitted into 45 minutes
of working time it would be a power rather than a speed test. Data obtained from the
experimental administrations indicated that the final forms of the tests would best measure power
if there were 90 items: 30 each for vocabulary, arithmetic reasoning, and spatial relations.
Furthermore, the selection of the items was determined by the number of items attempted
by 85% of the populations to which the experimental forms were administered.

Items were selected for the AFQT first on difficulty level with priority given to those
items whose levels were identical for the three criterion groups. For those items whose
difficulty levels were identical for only two of the criterion groups, that particular level was
used. Where the difficulty levels of an item differed among the three criterion groups and
the item had to be used, an average difficulty level was derived.
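The rule just described for settling on one difficulty level per item can be sketched in Python as follows. The function name is hypothetical, and because the report does not say how the "average difficulty level" was derived, the arithmetic mean of the three levels is used here purely as an assumption.

```python
from collections import Counter


def resolved_level(levels):
    """Resolve three criterion-group difficulty levels into one.

    - all three identical, or two of three identical: use that level
    - all three different: use the mean of the three (assumed averaging rule)
    """
    if len(levels) != 3:
        raise ValueError("expected one level per criterion group (three in all)")
    level, count = Counter(levels).most_common(1)[0]
    if count >= 2:
        return level
    return sum(levels) / len(levels)


print(resolved_level((3, 3, 3)))  # 3
print(resolved_level((3, 4, 3)))  # 3
print(resolved_level((2, 3, 4)))  # 3.0
```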

Each  group  of  30  items  was  distributed  among  the  various  difficulty  levels  as  shown  in 
Table  10. 


Table 10. Distribution of difficulty level in AFQT, Forms 1 and 2.


Difficulty Level             1 and 2    3        4        5        6        7        8        9

GCT (Standard Score)         Below 70   70-79    80-89    90-99    100-09   110-19   120-29   130+
% (of total item content)    20         16 2/3   13 1/3   16 2/3   10       10       6 2/3    6 2/3
No. of items                 6          5        4        5        3        3        2        2

Once the items were sorted according to the nine difficulty levels, the magnitude of the
over-all index became the next guide for selection. The over-all index was a weighted
composite of item biserial correlation coefficients equal to the sum of the three validity
coefficients plus the sum of the two internal consistency coefficients minus the sum of the two
independence coefficients. This over-all index was an appropriate criterion of the general
effectiveness of an item. In addition, the individual coefficients were carefully checked to
insure the selection of items with the highest validity and internal consistency together with
maximum independence.
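The over-all index defined above is a simple signed sum of seven coefficients per item. A minimal Python sketch follows; the function name and the coefficient values in the example are hypothetical and are not drawn from Appendix F.

```python
def overall_index(validities, consistencies, independences):
    """Over-all item index as described in the text:
    sum of three validity coefficients
    + sum of two internal consistency coefficients
    - sum of two independence coefficients.
    """
    if len(validities) != 3 or len(consistencies) != 2 or len(independences) != 2:
        raise ValueError("expected 3 validity, 2 consistency, 2 independence coefficients")
    return sum(validities) + sum(consistencies) - sum(independences)


# Invented coefficients for two candidate items at the same difficulty level.
item_a = overall_index([0.45, 0.44, 0.43], [0.65, 0.64], [0.53, 0.43])
item_b = overall_index([0.30, 0.35, 0.31], [0.49, 0.48], [0.22, 0.34])
print(round(item_a, 2), round(item_b, 2))
```

An item with high validity and internal consistency but low correlation with the other two content types receives the largest index, which is the property the selection procedure was seeking.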

The foregoing indices were considered also in regard to the construction of the two
alternate forms of AFQT. In this connection, a "percent passing" index was obtained for
the normalized sample and items were paired with respect to their p-values. In addition
to matching by difficulty level, comparability of AFQT-1 and AFQT-2 was further assured
by checking the items for similarity in content and psychological process.


In Table 11 are presented the results of pairing the items on difficulty level and the
distributions for each of the three types of material for the two forms.

Table 11. Comparability of item difficulty levels in AFQT-1 and AFQT-2.


               Vocabulary        Arith. Reasoning    Spatial Relations         Total
% Passing    AFQT-1  AFQT-2      AFQT-1  AFQT-2      AFQT-1  AFQT-2      AFQT-1  AFQT-2

    95          3       3           -       -           -       -           3       3
    90          5       2           2       3           -       -           7       5
    85          1       3           2       4           1       1           4       8
    80          -       2           6       2           1       2           7       6
    75          3       3           1       4           4       1           8       8
    70          2       2           2       2           2       4           6       8
    65          3       2           3       3           3       4           9       9
    60          1       -           2       1           3       4           6       5
    55          2       4           1       1           3       2           6       7
    50          4       -           1       1           4       3           9       4
    45          1       3           2       -           1       2           4       5
    40          1       2           1       3           2       1           4       6
    35          -       1           2       -           2       2           4       3
    30          2       1           1       1           2       1           5       3
    25          -       -           -       1           1       1           1       2
    20          1       1           1       2           1       -           3       3
    15          1       1           2       1           -       1           3       3
    10          -       -           1       1           -       1           1       2

     N         30      30          30      30          30      30          90      90
   Mean      66.83   66.17       61.00   62.83       57.33   57.17       61.20   61.55
Stand. Dev.  22.87   22.65       23.92   24.73       17.10   18.62       21.85   22.45
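As a check on how the summary rows of Table 11 relate to the frequency columns, the following Python sketch recomputes the mean and standard deviation for the Spatial Relations items of AFQT-1. It assumes, as an inference from the printed figures rather than anything the report states, that each percent-passing class is 5 points wide beginning at the printed value, so that its midpoint lies 2.5 points higher, and that the standard deviation is computed with N rather than N - 1 in the denominator. The variable names are hypothetical.

```python
from statistics import mean, pstdev

# Spatial Relations, AFQT-1: percent-passing class -> number of items,
# read from the corresponding column of Table 11 (empty classes omitted).
spatial_afqt1 = {
    85: 1, 80: 1, 75: 4, 70: 2, 65: 3, 60: 3, 55: 3,
    50: 4, 45: 1, 40: 2, 35: 2, 30: 2, 25: 1, 20: 1,
}

HALF_INTERVAL = 2.5  # assumed: classes are 5 points wide, labeled by lower limit

# Expand the grouped frequencies into one midpoint value per item.
midpoints = [lower + HALF_INTERVAL
             for lower, count in spatial_afqt1.items()
             for _ in range(count)]

n_items = len(midpoints)
grouped_mean = mean(midpoints)
grouped_sd = pstdev(midpoints)  # population form (divide by N)
print(n_items, round(grouped_mean, 2), round(grouped_sd, 2))  # 30 57.33 17.1
```

Under these assumptions the result agrees with the printed mean of 57.33 and standard deviation of 17.10 for that column.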

A listing of the item analysis results for the 90 items chosen for each form of AFQT
is to be found in Appendix F. This Appendix contains the three difficulty indices, the
percent passing, the three validity coefficients, the two internal consistency coefficients,
the two independence coefficients, and the over-all index.


STANDARDIZATION OF AFQT

Population and Collection of Data. The population on which norms for the AFQT were
established was a sample of the total potential military population under emergency
mobilization conditions. The choice of this population was made because successful military
operations require planning for mobilization and because it would aid in arriving at an equitable
distribution of the available manpower pool among the various Services in the event of an
emergency. The problem, then, was to decide what kind of sampling was needed to give a
representation of the total potential military population. Two alternatives were possible.
The entire civilian population could be sampled or use could be made of a previous
population for which data were already available. The decision was made to use a previous
population for the following reasons:

1. It was assumed that the millions of men available for testing prior to December 1944
would not differ essentially in age, education, occupational status, geographic distribution,
etc., from a similar population to be utilized five or ten years later.

2. The use of data on hand would be more economical than testing a sample of the
entire civilian population.

The population selected was that of all the men on duty in all the Services as of
31 December 1944. This included enlisted men, officer candidates, officers risen from the
ranks, and officers who had been commissioned directly from civilian life. Since many of
the officers who had been directly commissioned had not been tested, corrections were
applied to the score distributions. These corrections proved to be minor (for further
discussion, see Appendix G). The AFQT scores were assigned in accordance with the
percentages falling between 110 and 162 based upon the obtained distribution for enlisted men.

All  scores  were  converted  to  a common  base,  the  Army  Standard  Score  System.  After 
the  scores  were  converted,  the  distributions  were  blown  up  to  the  total  31  December  1944 
strength  (11,694,229).  A composite  cumulative  percentile  curve  was  set  up  in  5-point  Army 
Standard  Score  intervals  and  the  percentage  of  the  total  distribution  was  calculated  for  each 
interval.  Appendix  G presents  a more  detailed  account  of  the  derivation  of  this  distribution 
of  Army  Standard  Scores. 

Four samples of 1,000 cases each were selected to reproduce the distribution for the
entire population. AFQT-1 was administered to two of these samples. In one of the samples,
AFQT-1 was administered before the reference test (order I); in the other, AFQT-1 was
administered after the reference test (order II). AFQT-2 was administered to the other two
samples in the same two orders. The purpose of the two orders was to control the effect of
order of administration. Both orders for each form were used in the development of the
conversion tables.

The Army's portion of the standardization population was obtained from three Army
training divisions (3rd Armored at Fort Knox, Kentucky, 5th Armored at Camp Chaffee,
Arkansas, and 10th Infantry at Fort Riley, Kansas). The Air Force's portion came from
the Air Force Indoctrination Center at Lackland AFB, Texas, and the Navy's portion was
obtained from training centers at Great Lakes, Illinois, and San Diego, California. The
cases consisted of all incoming new recruits at these installations. They were tested
during July 1949. Additional testing was necessary in order to fill in the gaps at both ends
of the population which were caused in part by the use of enlistment cutting scores by the
three Services. These additional cases were obtained by the Army from selected
installations.

Statistical Treatment. The AFQT's for each of the four groups were scored and plots
were made of the cumulative percentile distributions of raw scores for each order of the
two forms of the test. The scoring formula used was rights only. An examination of these
percentile curves showed very little discrepancy in the two orders of administration for
either form. Accordingly, the orders were combined and the distributions for each form
were plotted. The differences between the two forms, particularly at the proposed cutting
points, were so slight that the use of a single conversion table appeared justified.

The similarity of the two forms of the test was further confirmed by another study in
which the two forms were compared on an additional Army group of 600 cases. For this
group there was an average practice gain of only two raw score points on either form. A
correlation of .93 between the two forms was obtained. As a result of this similarity
between the two forms, the distributions of the four standardization samples (4,000 cases)
were combined into a single distribution and a single percentile curve was plotted.

By means of equipercentile conversion, the AFQT scores were translated into Army
Standard Scores. Thus, any AFQT raw score or equivalent percentile score could be
interpreted in terms of the conventional Army scale.

Results. The percentiles obtained were compared with the expected percentiles of the
normal curve. This comparison for the standard scores more commonly used
administratively is shown in Table 12.


Table 12. Comparison of expected and obtained percentiles for AFQT
standard scores.


Standard Score     Percentiles Expected     Percentiles Obtained

     130                  93.3                      93
     120                  84.1                      82
     110                  69.2                      65
     100                  50.0                      49
      90                  30.9                      31
      80                  15.9                      21
      70                   6.7                      13
      60                   2.3                       7
      50                   0.6                       3

It can be seen from Table 12 that the obtained percentiles were fairly close to the
expected for standard scores above 80. Below the standard score of 80, there was a greater
discrepancy.
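
The "Percentiles Expected" column of Table 12 can be approximated from the normal curve. The sketch below assumes the Army Standard Score scale has a mean of 100 and a standard deviation of 20; those parameters are not stated in this section and are labeled as assumptions in the code.

```python
import math


def expected_percentile(standard_score, mean=100.0, sd=20.0):
    """Percent of a normal distribution falling below standard_score.

    mean=100 and sd=20 are assumed values for the Army Standard Score
    scale; they are not stated in this section of the report.
    """
    z = (standard_score - mean) / sd
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))


for score in (130, 120, 110, 100, 90, 80, 70, 60, 50):
    print(score, round(expected_percentile(score), 1))
```

Under these assumptions the computed values agree with the expected percentiles in Table 12 to within about 0.1.
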

One interpretation of this finding is that at the low end of the distribution, the score
achieved is influenced by both lack of mental ability and by illiteracy, as suggested by the
content of the test. In line with this interpretation, it is advisable to supplement the verbal
type of tests with non-verbal materials which will permit illiterates to demonstrate their
mental ability, provided that special training is made available for illiterates.

Appendix H shows conversions of AFQT-1 and AFQT-2 raw scores to percentiles and
Army  Standard  Scores  as  derived  from  this  standardization  study. 

The  relationship  of  AFQT  with  previous  Army  aptitude  tests  was  examined.  AFQT 
was found to be highly correlated with the reference test AGCT-1c or AGCT-1d, regardless
of  the  order  of  administration  (Table  13). 


Table 13. Correlations between AGCT and AFQT.


                      AGCT-1c or    AFQT-1       AFQT-1        AFQT-2       AFQT-2        AFQT
                      AGCT-1d       (order I)    (order II)    (order I)    (order II)    Total Group

Correlation (r)
  with AGCT              --           .91          .90           .91          .90           .90
Mean                    99.6         54.9         55.9          56.7         56.9          56.1
Standard Deviation      22.7         19.0         18.7          19.3         18.9          19.0
N                       4000         1000         1000          1000         1000          4000

AFQT was found to be substantially correlated with the individual tests comprising
Aptitude  Area  I (Reading  and  Vocabulary,  Arithmetic  Reasoning,  and  Pattern  Analysis), 
and  even  higher  with  Aptitude  Area  I (Table  14).  These  correlations  were  obtained  for  the 
Army  portion  of  the  AFQT-1  administered  in  the  order  I sample  only. 


Table 14. Correlation between AFQT and Aptitude Area I tests.


                                  Reading and    Arithmetic    Pattern     Aptitude
                        AFQT      Vocabulary     Reasoning     Analysis    Area I

Correlation (r)
  with AFQT              --          .83            .87          .75         .92
Mean                    53.5        97.7           89.4         95.7        94.3
Standard Deviation      18.3        23.5           24.5         25.8        21.9
N                        552         552            552          552         552

The  number  of  years  of  education  was  correlated  with  scores  on  AFQT  (order  1)  and 
R-5  and  R-6  (Table  15). 


Table 15. Correlations between years of education and AFQT-1, R-5, and R-6.
(N = 929)


                                    Standard      Correlation With
                          Mean      Deviation     Years of Education

AFQT-1                    55.0        19.3              .69
R-5, R-6                  98.9        22.8              .67
Years of Education        10.2         2.1               --

AFQT depends less on speed than do R-5 and R-6. Distributions of the last item
attempted showed that 51% of the men completed AFQT-1, 64% completed AFQT-2, and only
4% completed the R-tests. Another comparison showed that on the average, 92% of the items
on AFQT-1 were attempted, 94% on AFQT-2, and only 64% on R-tests.


OPERATIONAL APPLICATION OF AFQT-1 AND AFQT-2

Enlistment. AFQT-1 and AFQT-2 were put in operation for screening of enlistees at
Navy and joint Army-Air Force recruiting stations on 1 January 1950. The two forms of
AFQT replaced the R-5 and R-6 tests for screening at Main Recruiting Stations, and local
recruiting stations continued to use R-3 and R-4 for pre-screening before forwarding
applicants to Main Stations. The AFQT then became the common mental screening instrument for
all Armed Forces.

Induction.  Induction  had  been  used  very  little  by  the  Army  during  the  months  at  the  end 
of  1948  and  beginning  of  1949.  However,  in  July  1950,  under  the  1950  extension  of  the 
Selective  Service  Act,  the  Army  began  active  procurement  of  inductees  through  Selective 
Service.  The  Department  of  the  Army  was  designated  as  executive  agent  for  Joint  Armed 
Forces  Examining  and  Induction  Stations  which  would  process  inductees  for  all  Armed  Forces 
at such time as the other Services should call upon Selective Service for procurement of
personnel. Regulations for screening inductees provided for use of AFQT-1 and AFQT-2 with
a minimum acceptable score of percentile 13 (raw score 31). These regulations also provided
that, should other Services place calls for inductees, registrants would be allocated to the
Services proportionally within mental groups I, II, III, and IV. Mental group V is composed
of those scoring below the cut-off score.
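
The assignment of a score to a mental group can be sketched as follows. The group limits are those listed in Table 16 of this report; the function name and the `cutoff` parameter are illustrative assumptions, with the default of 13 taken from the minimum acceptable score quoted above.

```python
def mental_group(score, cutoff=13):
    """Return the AFQT mental group for a percentile-type score.

    Group limits follow Table 16; cutoff is the minimum acceptable
    score (13 under the 1950 induction regulations).
    """
    if not 1 <= score <= 100:
        raise ValueError("score must be between 1 and 100")
    if score >= 93:
        return "I"
    if score >= 65:
        return "II"
    if score >= 31:
        return "III"
    if score >= cutoff:
        return "IV"
    # Scores below the cut-off fall in mental group V.
    return "V"
```

Passing `cutoff=10` reflects the later lowering of the minimum acceptance score described in connection with Table 17.
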

Introduction of the "Converted Score." As AFQT-1 and AFQT-2 were used in induction and
recruiting, it was noted that the norms for AFQT in terms of Army Standard Score
equivalents did not seem to be the same for operationally obtained results as those established in
the original AFQT standardization. This was particularly noted at reception centers where
the Army Classification Battery was being administered. An abnormally large proportion
of new men who had passed AFQT at induction and recruiting stations were found to score
below the equivalent Army Standard Score on Aptitude Area I of the Army Classification
Battery. Aptitude Area I was composed of components of the old AGCT-3a and AGCT-3b
and had been previously found to correlate .92 with AFQT. This pronounced discrepancy
between AFQT and Aptitude Area I scores was termed "Operational Slippage," and was
attributed to non-standard administration of AFQT at induction and recruiting stations.

In order to correct conversions of AFQT for non-standard administration of the test in
the field, a new conversion table of raw scores into "Converted Scores" was placed into
effect 10 July 1950, for all Services using Selective Service. This table is shown in
Appendix I. The new table did not explain what a "Converted Score" was. However, in appearance
it is similar to a percentile score. Its range is from 1 to 100, and it takes the shape of an
ogive curve when raw scores are plotted against "Converted Scores." In the new table there
was considerable agreement between percentile and "Converted Score" equivalents to AFQT
raw scores in the upper quarter of the range, but up to 17 points difference in the lower half
and middle of the range. This table of "Converted Scores" replaced the percentile norms
for AFQT until 1 December 1951, when the original percentile norms for AFQT-1 and
AFQT-2 were restored. At this time it was expected that with the assignment of personnel
officers to supervise the testing at Armed Forces Examining Stations, standard testing
conditions would be maintained.

Allocations. On 1 May 1951, by direction of the Secretary of Defense the policy of
qualitative division of military manpower accessions among the Services on an equitable
basis was established. This policy applied to the total of male enlistee and inductee
accessions with certain exceptions such as officer candidates, aviation cadets, and prior service
enlistees. This policy provided that the number of enlistees and inductees procured by each
Service must conform to a fixed percentage distribution among the first four AFQT mental
groups. The percentages of input for each Service were allocated as shown in Table 16.

Table 16. Percentage allocation by mental group of enlisted and inducted
manpower among the Services.


                   AFQT                   AFQT            Manpower Percentage
Mental Group       "Converted Score"      Raw Score       Allocated

     I                 93-100               82-90                 8
     II                65-92                71-81                32
     III               31-64                57-70                39
     IV                13-30                39-56                21

It should be mentioned that the "Converted Score" equivalents to mental group limits
were the same numerically as the Percentile Score limits for these mental groups which had
been established for inductions under SR 615-180-1, 27 April 1950. Consequently, Army
Standard Score equivalents for the mental group limits were at considerable variance with
the traditional uniform pattern.

With  the  passage  of  the  Universal  Military  Training  and  Service  Act,  which  lowered  the 
minimum  AFQT  acceptance  score  for  military  service,  the  lower  limit  of  mental  group  IV 
was  changed  to  "Converted  Score"  10  and  allocation  quotas  were  adjusted  in  accordance  with 
a Department of Defense directive as shown in Table 17.


Table 17. Percentage reallocation of enlisted and inducted manpower
among the Services.


                   AFQT                   AFQT            Manpower Percentage
Mental Group       "Converted Score"      Raw Score       Allocated

     I                 93-100               82-90                 8
     II                65-92                71-81                31
     III               31-64                57-70                38
     IV                10-30                34-56                23

These definitions of mental group limits and the allocation percentage quotas were further
adjusted following the studies discussed in the next section of this report.


FOLLOW-UP STUDY OF AFQT-1 AND AFQT-2 STANDARDIZATION


PURPOSE 

Since  there  was  no  controlled  study  which  preceded  the  establishment  of  the  "Converted 
Score" norms placed into effect for AFQT-1 and AFQT-2, questions were raised regarding
some  of  the  facts  of  the  situation.  The  primary  questions  were:  (1)  To  what  extent  was  there 
"Operational  Slippage"  in  the  originally  established  Aptitude  Area  I standard  score  equivalents 
to  AFQT  scores;  and  (2)  What  is  the  nature  of  the  distribution  of  operationally  administered 
AFQT  scores  and  of  its  relation  to  that  obtained  under  controlled  administration? 

In  order  to  clarify  the  facts  underlying  the  original  standardization  of  AFQT  and  the 
norms  as  shown  in  the  "Converted  Score"  conversion  table,  the  Assistant  Chief  of  Staff,  G-l 
directed  that  a study  be  undertaken  to  compare  and  evaluate  the  test  scores  obtained  at 
recruiting  and  induction  stations  in  relation  to  scores  on  the  same  tests  administered  under 
more  uniform  conditions  of  administration  and  motivation  and  in  relation  to  Aptitude  Area  I 
scores  obtained  in  initial  processing;  so  as  to  determine  whether  any  change  in  existing  AFQT 
score  conversions  is  indicated,  and,  if  so,  what  change  should  be  made. 


GENERAL  PROCEDURE 

The study was designed so that procedures, duplicating the original AFQT
standardization, would be carried out separately for: (1) AFQT scores obtained from induction and
recruiting  station  administration;  and  (2)  Scores  on  the  alternate  form  of  AFQT  for  the  same 
men,  obtained  at  training  divisions  under  standardized  conditions.  Standardization  was 
accomplished  by  the  equipercentile  method,  using  Aptitude  Area  I as  a reference  test. 


POPULATION 

The sample consisted of 1,000 men undergoing reception processing at Fort Dix,
New Jersey; Fort Knox, Kentucky; Fort Jackson, South Carolina; and Fort Riley, Kansas,
during the week of 12 February 1951. This sample was selected (from a total of 4,961 men
constituting the regular flow of input at that time) to duplicate proportionally, by 10-point
standard score intervals in R-5, the World War II population. It included both enlistees and
inductees, unselected as such, but drawn as shown in Table 18.


Table 18. Population components used in study of AFQT "Converted Scores."


                     Fort Dix    Fort Knox    Fort Jackson    Fort Riley    Total

Enlistees (No.)         50          82             97              59         288
Inductees (No.)        199         174            158             181         712
Total (No.)            249         256            255             240        1000

RESULTS 

Comparison of Standardization Runs. Results of the original AFQT standardization were
compared with the follow-up standardization results for AFQT using scores from both
induction and recruiting stations (called "operational conditions") and from standard administration
at the training divisions (called "standard conditions"). Table 19 shows these results in
terms of AFQT raw score equivalents (computed by the equipercentile method) for certain
Army Standard Scores.


Table 19. Comparative AFQT-1 and AFQT-2 norms from standard and operational
administrations.


                              AFQT Raw Score Equivalents

Army            Original              Present Study         Present Study
Standard        Standardization       (Standard Cond.)**    (Operational Cond.)**
Score           (Standard Cond.)*
                     (1)                   (2)                   (3)

  160                 90                    90                    90
  150                 88                    90                    90
  140                 85                    89                    89
  130                 81                    84                    83
  120                 74                    77                    75
  110                 65                    68                    67
  100                 57                    57                    57
   90                 48                    49                    47
   80                 39                    41                    43
   70                 31                    31                    39
   60                 22                    20                    36

*Standard score equivalents determined using R-5 (ARJT) as a reference test.
**Standard score equivalents determined using Aptitude Area I as a reference test.


It  can  be  seen  that  this  study  gave  approximately  the  same  norms  for  AFQT  given  under 
standard  conditions  as  were  given  in  the  original  standardization  study. 

Comparison of AFQT Scores Obtained from Operational and from Standard Test Administration.
However, it may also be seen in Table 19 that there were more pronounced differences in
AFQT standard score equivalents between the original standardization study and results
obtained from AFQT tests given under operating conditions. The AFQT equivalents to
standard scores of 80 and below were considerably higher for operational scores than they were in
the original standardization. Particularly, a standard score of 70 increased from AFQT raw
score 31 to raw score 39; and a standard score of 60 increased from raw score 22 to raw
score 36. Standard score conversions for AFQT, then, did not hold up at levels below standard
score 80 for AFQT scores obtained under operational administration conditions.

The most dramatic evidence of such inflation at the lower end of the distribution was
shown in the comparisons of raw score distributions for AFQT administered under
operational and under standard conditions (see Figure 3). The distribution of scores from standard
administration is relatively smooth, though negatively skewed. This is a typical unselected
sample distribution for AFQT. The skewness has been built into the test, since it was
constructed to be more differentiating at lower portions of the scale. The distribution for
operationally obtained AFQT scores is markedly bimodal, and most abnormally modal at the
interval containing raw score 39, the current Army cut-off score. The absence of scores below
39, of course, is due to application of this cut-off, plus the fact that administratively
accepted inductees were not included in this study.

"Administratively accepted inductees" refers to men who were inducted because their
failure on AFQT was interpreted to be motivational rather than genuine lack of ability. High
school graduates who failed were administratively determined by the commanding officer of
the examining station to have met the mental requirements. Other failures were interviewed
and administrative determination was made. A Terminal Screening Guide was prepared at a
later date to serve as an aid to the interviewer by providing suggestions as to the type of
additional information he might find useful in arriving at his decision (e.g., job history,
educational history, ability to drive a car, instructions for administering the Individual
Examination).


The  obvious  interpretation  of  these  graphs  is  that  most  of  the  scores  around  39  which 
are  obtained  at  induction  and  recruiting  stations  represent  scores  for  men  whose  scores  in 
AFQT  obtained  under  standardized  conditions  really  are  lower. 

Scatterplots  of  operational  administration  and  standardized  administration  AFQT  scores 
showed  this  to  be  true.  The  distribution  of  standardized  administration  scores  of  those  in  the 
interval  containing  operationally  administered  raw  score  39  predominantly  ranged  below  raw 
score  39. 

The above mentioned inaccuracy in operational AFQT scores at the lower end of the
distribution occurs in scores reported from both induction and recruiting stations. Figures
J-1 and J-2 in Appendix J show the comparative distributions of standard administration
scores and operational administration scores for both enlistees and inductees. Again the
abnormal mode was obtained at the operational score interval containing the cut-off score of
39 for each group. In the enlistee operational distribution 29% of scores were in this
interval as were 16.5% of the inductee scores, showing that the abnormality was more pronounced
for enlistees. Separate scatterplots of operational and standard administration AFQT scores
for enlistees and inductees also demonstrated that operational scores reported at and
immediately above the cut-off score 39 are predominantly, for both enlistees and inductees,
scores of men whose standard administration scores were at failure levels.


CONCLUSIONS 

The major conclusions based on the results of the comparison of AFQT scores obtained
under operational conditions and under standard conditions were:

1. "Operational Slippage" occurred in the operational scores as indicated by the
pile-up of scores at the cut-point, for both enlistees and inductees.

2. Under standard administration, the pile-up at the cut-point was absent and the
original standardization percentiles were obtained.

3. The correction of operating conditions of test administration appears to be the
solution for avoiding the discrepancies which appear in operational AFQT scores.


EVALUATION OF AFQT CONVERSION TABLES



Along with the expansion of military strength, there was an increasing interest on the
part of the Department of Defense in manpower problems. This was evidenced by a
strengthening of the organization of the Office of the Assistant Secretary of Defense,
Manpower and Personnel, to provide direct channels for dealing with inter-service manpower
questions. In the area of recruitment and induction such a channel was provided by the
organization of a system of Armed Forces Examining Stations in 1951 to deal with questions
of procurement, selection, and allocation of military manpower. Beginning 1 July 1951, the
examining functions of Army, Navy, Air Force, and Marine Corps Main Recruiting Stations
were consolidated and responsibility for these transferred to Armed Forces Examining
Stations. By the end of 1951 there were about 75 such stations throughout the United States.
Responsibility for development of policy applicable to these stations was vested with the
"Armed Forces Examining Station Policy Board" (AFES PB). This Board was established
within the Office of the Assistant Secretary of Defense, Manpower and Personnel. It was
composed of the Director, Manpower Utilization, Office of the Assistant Secretary of Defense,
Manpower and Personnel, as Chairman, and one general or flag officer from each of the four
Services. This Board designated the Department of the Army as executive agent for
administration of the various Armed Forces Examining Stations, though staffing of the stations included
personnel from all Services.

One of the primary problems of the AFES Policy Board at this time was that of
allocation of military accessions (recruits and inductees) to the four Services. As was pointed out
in the previous discussion on operational application of AFQT, such accessions were allocated
on the basis of distributions among the mental groups I, II, III, and IV as determined by
AFQT. This allocation began on 1 May 1951. However, considerable concern arose
immediately over the fact that in experience of the Services during May and June 1951, the obtained
percentage distributions of accessions for the four mental groups differed appreciably from
the predicted percentages as prescribed in the allocation formula. Table 20 shows the
predicted (prescribed) percentages for allocation and the obtained percentages of all Services'
input in the four AFQT mental groups.


Table 20. Percentages of accessions in AFQT Mental Groups, 1951.


Mental      "Converted        Prescribed Quotas      Actual Distribution
Group       Score" Range      2 April Directive       May         June
 (1)            (2)                 (3)               (4)         (5)

  I           93-100                8.0               5.5         6.4
  II          65-92                32.0              16.8        17.9
  III         31-64                39.0              27.6        30.8
  IV          13-30                21.0              50.1        44.9

Total                             100.0             100.0       100.0

The prescribed quotas shown in column (3) were deemed to be an adequate prediction of
the distribution of total input, based on the assumption that the "Converted Scores" represent
percentile norms. Therefore, any mental group should contain the proportion of total input
equal to the range of "Converted Scores" in the group divided by the total range of acceptable
scores (i.e., range in mental group II equals 65-92 or 28; total acceptable range equals
13-100 or 88; therefore expected percentage in group II equals 28/88 or 32 as shown in
column (3)). When the obtained distribution for May and June did not resemble the prescribed
quotas, the AFES Policy Board raised the question of the accuracy of the AFQT norm table.
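
The quota arithmetic in the preceding paragraph can be reproduced with a short sketch. The group limits are those of Table 20; the function name, the dictionary layout, and the rounding to whole percentages are illustrative assumptions rather than anything specified in the report.

```python
# Upper three mental groups; limits taken from Table 20.
GROUP_RANGES = {"I": (93, 100), "II": (65, 92), "III": (31, 64)}


def expected_quotas(cutoff):
    """Expected percentage of acceptable input in each mental group,
    assuming scores behave as percentiles (uniformly distributed)."""
    ranges = dict(GROUP_RANGES)
    ranges["IV"] = (cutoff, 30)  # group IV starts at the cut-off score
    total_range = 100 - cutoff + 1  # e.g. 13-100 is 88 points
    return {
        group: round(100 * (high - low + 1) / total_range)
        for group, (low, high) in ranges.items()
    }


quotas_13 = expected_quotas(13)
quotas_10 = expected_quotas(10)
```

With a cut-off of 13 the sketch gives 32 for group II, matching the worked example; its values for groups I and IV (9 and 20) each differ by one point from the prescribed quotas of 8 and 21, a difference the report does not explain. With a cut-off of 10 it gives 9, 31, 37, and 23, the figures listed in Table 21.
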

To answer this question the AFES Policy Board appointed a "Working Group on
Evaluation of the AFQT Conversion Table" to evaluate the correctness of the norms. This working
group consisted of one research psychologist from each of the four Services.4/

This working group recognized the differences between the norms for AFQT as
established in the original percentile norms and as established in the revised "Converted
Score" norms. It also took cognizance of the fact that the "Converted Score," although it
represents a partial correction for non-standard operational test administration, does not
define percentile norms for AFQT. Therefore, using the original standardization percentile
equivalents to AFQT "Converted Scores," it was shown that a predicted distribution similar
to that obtained in May and June would result.

The  working  group  recommended  that: 

1. Test administration practices in Armed Forces Examining Stations be improved to
attain standard conditions.

2. When such conditions are attained, the original percentile norms be used to replace
"Converted Score" norms in determining allocation quotas.

3. Based on these percentile scores under the current standards for induction (minimum
acceptable score at the 10th percentile), the quotas for the four mental grades should be as
shown in Table 21.


Table 21. Expected percentages of accession under standard testing conditions.


Mental Grade      Percentage Quotas

     I                    9
     II                  31
     III                 37
     IV                  23

Total                   100

4/ The members were: Army, Dr. Julius E. Uhlaner; Navy, Dr. Kenneth E. Clark; Marine
Corps, Mr. Francis F. Medland; Air Force, Dr. Charles C. Limburg, Chairman.


Operational action was taken almost immediately to effect improvement of conditions
highlighted by these studies. Two major actions had a direct bearing on the above
recommendations:

The administration of mental testing was placed under the supervision of commissioned
personnel psychologists assigned to the AFES. The Army set up a special training program
at the Adjutant General's School in which personnel psychologists (MOS 2230) were given a
two-week intensive course specific to AFES mental examining procedures. Emphasis was
placed in this course on the necessity for good test administration to control "Operational
Slippage" in AFQT. Each AFES was subsequently assigned at least one of these trained
personnel psychologists. Preliminary evidence points in the direction of considerable
improvement.


The AFES Policy Board re-instituted the original AFQT percentile conversion table
1 December 1951, to replace the "Converted Score" table.


USE OF AFQT BY THE ARMY, NAVY, MARINE CORPS, AND AIR FORCE

The Armed Forces Qualification Test is currently used as an initial screening
instrument by all four military Services. The various uses to which the test is put by the Services
are  summarized  below.  All  cutting  scores  were  administratively  determined. 


ALL  SERVICES 


1. The AFQT is used for determination of acceptability for enlistment on the basis of
mental qualifications. The minimum acceptable score on AFQT-1 or AFQT-2, beginning
15 July 1951, is percentile score 10.

2. The AFQT is used for determination of acceptability for induction of Selective
Service registrants on the basis of mental qualifications whenever any Service uses the induction
machinery for procurement of personnel. The minimum acceptable score on AFQT-1 and
AFQT-2, beginning 15 July 1951, is percentile score 10.

3. In induction screening, an additional use of AFQT is made for classifying those
registrants who fail to achieve the minimum score. For this purpose, the answer sheets of
AFQT failures are rescored with the AFQT Verbal-Arithmetic Key, which provides a rights
minus one-third wrongs score on the first 30 verbal and arithmetic items. Those
achieving such a score of 6 or higher are designated as not acceptable, but are placed in a
deferred category for possible future induction.

4. Equitable distribution of military accessions (allocation) is accomplished by
percentage quotas of total chargeable accessions within each Service based on distribution of
men in AFQT mental groups I through IV.
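
The rescoring rule in item 3 above can be sketched as follows. The function names are illustrative, and the treatment of omitted items (they contribute nothing to the score) is an assumption; the report states only the rights minus one-third wrongs formula, the 30-item limit, and the threshold of 6.

```python
def verbal_arithmetic_score(rights, wrongs):
    """Rights minus one-third wrongs on the first 30 verbal and
    arithmetic items. Omitted items are assumed not to count."""
    if rights < 0 or wrongs < 0 or rights + wrongs > 30:
        raise ValueError("rights and wrongs must be non-negative and total at most 30")
    return rights - wrongs / 3


def is_deferred(rights, wrongs, threshold=6):
    """True if an AFQT failure reaches the deferred category
    (score of 6 or higher on the Verbal-Arithmetic Key)."""
    return verbal_arithmetic_score(rights, wrongs) >= threshold
```

For example, a registrant with 10 rights and 12 wrongs scores 10 - 12/3 = 6 and would be placed in the deferred category, while one with 7 rights and 6 wrongs scores 5 and would not.
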


ARMY 


1. The AFQT is used for determining acceptability of women for enlistment in the WAC
on the basis of mental qualifications. The minimum AFQT acceptable score is percentile score
31.

2. Applicants for enlistment from civilian life as officer candidates are screened with
AFQT to determine mental qualification. The minimum acceptable score on AFQT is
percentile score 65.


NAVY


Mental qualifications of women enlistee applicants for the WAVE are determined by the
AFQT. The minimum acceptable score is percentile score 37.


MARINE  CORPS 


Women applicants for enlistment in the Marine Corps are screened with AFQT. The
minimum  acceptable  score  is  percentile  score  37. 


AIR  FORCE 


Women  applicants  for  enlistment  in  the  WAF  are  screened  with  AFQT.  The  minimum 
acceptable  score  is  percentile  score  49. 


SUMMARY  OF  PROGRAM 


This  program  was  inaugurated  to  develop  screening  tests  which  would  provide  a basis 
for  mental  standards  of  acceptance  of  recruits  and  inductees. 


1. In response to the Army's need for screening tests (as distinguished from
classification tests), R-2, R-3, and R-4 were constructed. These tests were first used at Central
Examining Stations for final screening. Following the development of R-5 and R-6 and their
introduction operationally at Central Examining Stations, R-3 and R-4 were available for use
by local recruiting stations as initial screening tests.


2. To meet the need for greater uniformity among the Services in mental screening
procedures, the Armed Forces Qualification Test (AFQT), Forms 1 and 2, was developed
as a joint effort of Army, Navy, Air Force, and Marine Corps research personnel, with the
Department of the Army acting as coordinating agency. The AFQT is used operationally by
the Army, Navy, Air Force, and Marine Corps for determining mental qualifications of male
and female enlistee applicants and for screening Selective Service registrants for induction
by those Services which may so procure personnel. The AFQT also provides the basis for the
Department of Defense system of qualitative distribution of military accessions among the four
Services.


3. Each of the two forms of AFQT contains 90 items divided equally among vocabulary,
arithmetic reasoning, and spatial relations. Items were selected on the basis of item analyses
so as to provide a spread of difficulty over the entire useful range. To provide comparable
forms, items were matched not only for validity and difficulty but for similarity in content and
psychological process as well.


4. Standardization of AFQT was based on samples of the entire military population on
duty in all the Services as of 31 December 1944. The two forms were standardized separately.
The differences in the distribution of scores on the two forms were so slight that a single
conversion table was adopted. By means of equipercentile conversion, scores were translated
into standard forms.


5. AFQT scores were found to be highly correlated with scores on Army aptitude tests
such as AGCT and its successor, Aptitude Area I.

6. It was found that distributions of scores obtained from operational administration
differed significantly from the distribution of scores expected on the basis of the
standardization studies. A follow-up study substantiated the original standardization. To reduce this
"Operational Slippage," steps were taken to control test administration at Armed Forces
Examining Stations. Preliminary evidence points in the direction of considerable improvement.

7. Other studies have been and still are being made. One series of studies has resulted
in the construction of "motivation keys" to control the effect of attempts to distort or bias test
scores. Another series is directed at developing nonverbal forms of mental screening tests.
Research efforts will continue to be directed toward maintaining and improving screening tests
in accordance with improvements in test construction techniques, administrative policies, and
operating problems.

COLLECTION  OF  DATA:  February  1946  to  November  1951 
PREPARATION  OF  REPORT:  1 May  1952 
APPENDIX A


CHRONOLOGY  OF  SCREENING  TESTS  AND  STANDARDS 
(Tests  R-l  through  AFQT-1  and  AFQT-2) 


18 April 41: Regulations excluded from military service all men who did not have the capacity
for "reading and writing the English language as commonly prescribed for the
fourth grade in grammar school." (Changes to Mobilization Regulation 1-7,
18 April 1941)

1 June 42: Induction was permitted of men who could not meet the literacy standards
provided they possessed "sufficient intelligence to absorb military training rapidly."
(WD Circular 189, 1 June 1942)


31 Oct 42: R-1 introduced as screening test for induction of limited service (physically
restricted) personnel. Minimum acceptable score is standard score 90. (TWX,
Symbols OC-S-WDGAPO, 31 Oct 1942)


Nov  45:  Inductions  stopped.  Procurement  by  enlistment  only  begun. 


7 Feb 46: G-1 directed that a test be constructed for use in Army enlistment which could
be  administered  individually  or  in  groups  with  a minimum  of  verbal  instruction, 
could  be  completed  in  less  than  30  minutes,  and  would  provide  scores  as  nearly 
equivalent  to  AGCT-3a  standard  scores  as  possible  within  the  lower  test  score 
ranges.  (D/F  from  WDGS,  G-l,  File:  WDGAP  342.06,  7 February  1946) 


17 April 46: Raw score of 15 on R-1 (standard score 70) is minimum acceptable for
enlistment. (WD Circular 110, 17 April 1946)


12 June 46: R-1 used at local recruiting stations, where raw score of 15 is minimum
acceptable for referral to Central Examining Station, where raw score 16 on R-2
(standard score 70) is minimum acceptable for enlistment. R-2 transferred from
Main Recruiting Stations to Central Examining Stations. (WD Circular 171,
12 June 1946)


9 Aug 46: R-1 and R-2 used at local recruiting stations where raw scores of 13 on R-1 or
16 on R-2 (standard score 70 on either) are minimum acceptable for referral to
Central Examining Stations, where raw score of 6 on R-3 or R-4 (standard
score 70) is minimum acceptable for enlistment. (WD Circular 239, 1946)


23 April 47: Raw score of 19 on R-1 or 23 on R-2 (standard score 80 on either) is minimum
acceptable at local recruiting stations for referral to Central Examining Stations,
where raw score of 13 on R-3 or R-4 (standard score 80) is minimum
acceptable for enlistment. R-2 is to replace R-1 as soon as stocks of R-1 are
depleted. (WD Circular 103, 23 April 1947)


11 Feb 48: Examining functions are transferred from Central Examining Stations to Main
Recruiting Stations. Use of R-5 and R-6 is prescribed. (Memo 600-750-28,
11 February 1948)


12 March 48: Minimum acceptable score for enlistment at Main Recruiting Stations is standard
score 80 on R-3, R-4, R-5, or R-6. (WD Circular 66, 12 March 1948)


30 March 48: Standards are the same as in WD Circular 66. Test R-3 or R-4 is given at
local recruiting stations and R-5 or R-6 is given at Main Recruiting Stations.
(Memo 600-750-30, 30 March 1948)

27 July 48: Minimum score acceptable for enlistment is standard score 70 on R-3 or R-4,
but 80 on R-5 and R-6. (Memo 600-750-30, 27 July 1948)

Nov 48: Inductions began under Selective Service Act of 1948. Inductions continued for
3-month period (November 1948 to January 1949) and were then terminated
until August 1950. Minimum acceptance standards on R-5 and R-6 same as for
enlistment (standard score 70 GCT equivalent included in Selective Service Act
of 24 June 1948, PL 759, 80th Congress, as the minimum acceptable score).

30 Dec 48: Minimum acceptable score at initial recruiting point is raw score 9 on R-3 or
R-4 (standard score 75) for referral to Main Recruiting Stations (where standard
score 70 on R-5 or R-6 is minimum acceptable for enlistment). (WCL
3S081, 30 December 1948)

1 Jan 50: AFQT Forms 1 and 2 to replace R-5 and R-6 at Main Recruiting Stations. At
local recruiting stations R-3 and R-4 continue in use. (SR 615-105-25,
13 December 1949)

1 Jan 50: R-3 and R-4 used at local recruiting stations where raw score of 17 or higher
on either (standard score 90) is minimum acceptable for referral to Main
Recruiting Stations where a percentile score of 31 (standard score 90) on
AFQT-1 or AFQT-2 is minimum acceptable for enlistment. (TAG letter, File:
AGSE 342, 22 December 1949)

27 April 50: AFQT Forms 1 and 2 to be used at Joint Examining and Induction Stations for
screening inductees. Percentile score 13 (standard score 70) to be minimum
acceptable for induction. (SR 615-180-1, 27 April 1950)

10 July 50: A new conversion table of raw scores into "Converted Scores" was prepared at
the direction of Army G-1 and placed into operation 10 July 1950. (Letter from
TAG to all Army Commands, File: AGPP-P 220.01, 10 July 1950)

17 July 50: For enlistments, raw score 6 on R-3 (standard score 70) is minimum acceptable
at local recruiting stations for referral to Main Recruiting Stations where
percentile score of 13 (standard score 70) is minimum acceptable for enlistment.
(WCL 32938, TAG, 17 July 50)

18 July 50: For enlistment, standards at local recruiting stations unchanged. At Main
Recruiting Stations, "Converted Score" 13 on AFQT-1 or AFQT-2 (standard
score 70 adjusted for operational administration) is minimum acceptable for
enlistment. (WCL 33372, TAG, 18 July 1950) "Converted Score" table
replaced percentile score equivalent table for determining AFQT-1 and AFQT-2
norms to adjust for slippage in operational testing conditions for enlistment.

Aug 50: Inductions begin for Army under 30 June 1950 Extension of Selective Service Act
of 1948. (PL 599, 81st Congress)

2 Nov 50: "Converted Score" is minimum acceptable score on AFQT-1 and AFQT-2
(adjusted standard score 70) for induction. (SR 615-130-1, Change 3,
2 November 1950)
11 Jan 51: G-1 directed that a study be undertaken to compare test scores obtained under
operational and under standard conditions so as to check the existing AFQT
Score conversions. (DF from G-1 to TAG, File: G-1 201.6, 11 January 1951)


2 April 51: By direction of the Secretary of Defense, the policy of qualitative division of
military manpower accessions among the Services on an equitable basis was
established. (Memorandum for Secretaries of Army, Navy, and Air Force,
and Joint Chiefs of Staff, Subject: "Qualitative Distribution of Military
Manpower," 2 April 1951)

19 June 51: Percentile score of 10 (standard score 65) on AFQT established as minimum
acceptable for induction. (PL 51, 82nd Congress, amendment to Universal
Military Training and Service Act)

30 June 51: "Converted Score" of 10 on AFQT-1 and AFQT-2 (adjusted standard score 65)
is minimum acceptable for induction. (Department of Defense Directive 100.03-1,
30 June 51)


18 July 51: "Converted Score" of 10 on AFQT-1 and AFQT-2 (adjusted standard score 65) is
minimum acceptable for enlistment at Main Recruiting Stations. (DA Radio 34878,
TAG, 18 July 1951)

30 Oct 51: Examining functions for recruitment and induction by all Services transferred to
Armed Forces Examining Stations. Commissioned personnel psychologists were
assigned to supervise administration of testing. (SR 615-100-1, 30 October 1951)

5 Nov 51: For induction, AFQT-1 and AFQT-2 supplemented by additional screening with
the AFQT Verbal-Arithmetic Subtest, Non-Language Qualification Test (NQT-1).
This supplemental screening given AFQT failures to classify them for possible
future induction. (SR 615-180-1, 5 November 1951)

23 Nov 51: Percentile score of 10 on AFQT-1 and AFQT-2 (standard score 65) is minimum
acceptable for both enlistment and induction. The "Converted Score" table for
determining AFQT norms is replaced by the original percentile norm table, since
standard testing conditions are assumed to have been achieved in examining
stations. (DA Radio 46247, TAG, 23 November 1951)
APPENDIX B


N's, MEANS, STANDARD DEVIATIONS AND PRODUCT-MOMENT CORRELATIONS
FOR INDICATED COMBINATIONS OF R-2, R-3, R-4, AND AGCT-3a

Camp Atterbury Reception Center, May 1948


                           Rx-Test              AGCT-3a      r Between Rx-Test
Form   Population*      N      M      SD       M      SD       and AGCT-3a

R-2        A           700   34.4   13.2     98.8   20.5          .79
R-3x       B           700   22.1   10.9     98.9   20.0          .86
R-3x       D          1000   22.5   11.4     98.8   20.1          .83
R-4x       C           600   22.1   10.7     99.0   20.6          .80
R-4x       D          1000   25.6   11.3     98.8   20.1          .86

r between R-3x and R-4x: .80

*Tests administered: A - R-2, AGCT-3a
                     B - R-3x, AGCT-3a
                     C - R-4x, AGCT-3a
                     D - R-3x, R-4x, AGCT-3a



APPENDIX C


CONVERSION TABLES: RAW SCORES ON R-2, R-3, AND R-4 TO ARMY
STANDARD SCORES


APPENDIX  E 


BREAKDOWN  OF  TEST  POPULATION  FOR  GCT-7x  AND  GCT-8x 
BY  DIVISION,  STATION,  AND  STATUS 


GCT-8X 

Rejects  Recruits  Total 


Recruiting  Stations 

Army-Air  Force 

Navy 

Total 


visions 


Army  (Dlx) 

Air  Force  (Lackland) 
Navy  (San  Diego) 
Total 


Combined  Recruiting 
Stations  and  Training 
Divisi 


Em 


Army 
Air  Force 
Navy 
Total 


Rejects 

GC  ; -7x 
'F^cruits 

Total 

158 

430 

588 

129 

129 

278 

430 

717 

1125 

772 

$61 

2758 

1713 

772 

3475 

t 

L49 

456 

005 

r 

L34 

134 

APPENDIX F


Table F-1. Some item statistics of items in final form of AFQT

top*rl**nWl 

■ ros 

Difficulty^ 

"iSu 

& 

Tilling 

(UhtUUI 

CcacUtcnej^ 
IUn-TMt 
Self  Corclatiooc 

latcpcoa 

COM  Oocffloi*: 
ArltUmtla 

OTtmuV 

*o. 

fora 

>0. 

8 

sight 

*af  3 BEI  II 

»WU; 

YooAbultry 

»—  oping 

KcUtlocc 

InAc 

i-a 

(practio#  fum) 

.66 

.86 

9 

a-T 

56 

1 

1 

1 

95 

-5b 

.62 

.91 

.44 

.76 

•32 

2.53 

10 

a*-T 

tt 

1 

1 

2 

96 

.38 

A7 

•1? 

1.12 

•97 

.44 

•a 

•3b 

2.46 

n 

ai-T 

20 

1 

1 

1 

95 

.b6 

M 

.77 

.92 

.44 

.61 

.14 

2-37 

12 

£8c-t 

2b 

2 

2 

3 

93 

-78 

.» 

.7 

1.13 

.99 

.44 

.79 

.20 

3.18 

13 

a-r 

29 

0 

2 

2 

92 

58 

.62 

•«> 

1.03 

.a 

.M 

.76 

•V> 

2-57 

lb 

a-T 

to 

2 

2 

2 

91 

.68 

.50 

.61 

.91 

1.0b  - 

.44 

.65 

.33 

2.76 

15 

71 

lb 

1 

1 

2 

68 

.29 

•by 

.66 

.80 

.71 

.42 

.44 

.49 

'.02 

16 

7X-A 

9 

1 

1 

2 

92 

.26 

.56 

.'rt 

.61 

.7b 

.54 

.44 

.74 

1.68 

17 

7X-A 

12 

2 

1 

1 

87 

.bo 

.26 

.63 

.b9 

•58 

.20 

.44 

.28 

1.68 

IS 

a^A 

35 

2 

3 

3 

85 

.58 

.62 

.71 

•61 

63 

.69 

.44 

.46 

2.22 

19 

7X-A 

27 

2 

3 

s 

a 

.39 

.VS 

•S3 

-65 

.62 

.42 

.44 

.40 

1.75 

20 

7X-A 

23 

3 

1 

4 

78 

•52 

.bl 

.68 

.65 

•39 

.44 

.47 

1.92 

21 

7X<6 

6 

1 

l 

1 

88 

.44 

.10 

.7b 

■59 

.22 

.49 

.44 

1.45 

22 

7X-0 

56 

3 

1 

Z 

79 

.b2 

.63 

.70 

.61 

.32 

.44 

1.24# 

*3 

.71*6 

10 

1 

1 

3 

79 

.4 

-35 

.37 

.65 

•57 

.28 

.44 

.44 

1.82 

2b 

7X*fl 

60 

2 

1 

1 

81 

-33 

•*3 

.67 

•57 

.21 

.37 

.44 

1.24 

25 

7x*a 

26 

3 

2 

2 

79 

.26 

•59 

•58 

•55 

.20 

.42 

.44 

1.00 

26 

ua 

13 

1 

1 

2 

78 

•• 

.27 

.11 

.56 

•59 

.19 

.39 

.44 

1.25 

IT 

a.t 

55 

3 

1 

3 

91 

.6b 

•50 

.63 

-95 

.90 

.74 

.29 

2.51 

26 

7X-Y 

5? 

2 

3 

.1 

30 

.61 

.66 

•7» 

•99 

•92 

.44 

.79 

.60 

2.43 

29 

7X-Y 

51 

3 

3 

t 

e> 

.61 

.61 

•37 

.76 

■79 

.44 

.59 

.58 

2.17 

30 

7X-Y 

83 

3 

2 

t 

79 

•53 

.b8 

.38 

.73 

.82 

.*4 

•5b 

.47 

2.13 

31 

a-? 

77 

3 

3 

t 

79 

-58 

-59 

.61 

.89 

•79 

.57 

.42 

2.47 

32 

a-t 

73 

t 

t 

t 

78 

.68 

.66 

•71 

.a 

.8b 

.44 

•72 

.46 

2.60 

33 

* aux 

12 

2 

2 

3 

8b 

.bl 

•36 

.53 

.56 

.bl 

.48 

.44 

.23 

1.58 

£ 

71 -A 

16 

1 

3 

2 

a 

.ta 

-51 

.17 

.67 

.63 

.47 

.44 

.44 

1.6 

35 

7X-A 

29 

3 

t 

3 

a 

.td 

.ta 

• tt 

.71 

.6b 

.42 

.44 

.51 

1.82 

36 

71  *A 

15 

2 

3 

t 

81 

.58 

.66 

.61 

.61 

•72 

-5b 

.44 

.55 

2.29 

37 

a .a 

38 

2 

3 

76 

.50 

»b2 

.31 

.66 

-50 

-53 

.44 

•35 

1.71 

30 

71-A 

38 

3 

5 

t 

71 

.39 

-51 

.62 

•72 

-71 

.43 

.44 

.45 

2.07 

39 

71-8 

58 

2 

2 

3 

73 

-• 

-39 

•31 

•59 

•55 

.29 

.33 

.*• 

1.20 

to 

71-8 

31 

5 

2 

t 

69 

»• 

-38 

•3* 

.53 

.55 

•23 

•39 

.44 

1.16 

ti 

7X-8 

39 

1 

t 

t 

70 

»• 

•b2 

.33 

.63 

.62 

.32 

.37 

.44 

1.31 

b2 

a-8 

33 

5 

t 

t 

66 

.# 

•30 

•tt 

.bl 

.39 

.35 

.36 

•«* 

0.61 

t3 

a -8 

to 

3 

5 

5 

f3 

«• 

-b3 

.17 

.57 

.59 

-3b 

.39 

.44 

1-33 

tt 

7X-8 

b7 

t 

t 

5 

62 

.to 

.12 

.62 

.6b 

•3b 

.49 

.44 

1.25 

t5 

a-r 

35 

3 

t 

5 

73 

.50 

.57 

.68 

.73 

-73 

.44 

.60 

•39 

2.22 

t6 

TX-Y 

39 

t 

t 

5 

73 

-b5 

.52 

.61 

.65 

.90 

.44 

52 

.41 

2.10 

br 

a.Y 

7b 

t 

t 

t 

69 

.50 

-52 

•62 

-70 

.63 

.44 

.58 

.36 

2.03 

t8 

a-r 

83 

0 

5 

5 

66 

.62 

•5b 

.59 

.a 

.66 

.44 

.64 

.40 

2.18 

* t9 

awr 

a 

5 

6 

t 

66 

-59 

-63 

43 

.a 

•70 

.44 

.65 

.ta 

2.27 

S 50 

a-Y 

60 

6 

t 

5 

63 

.t6 

.46 

.66 

.71 

-55 

.4# 

.33 

.63 

1.80 

• 51 

2-A 

3b 

5 

t 

t 

70 

-51 

-bl 

.66 

.60 

-65 

•33 

.44 

.39 

1-91 

92 

a -a 

tl 

4 

a 

5 

69 

.62 

.61 

.66 

.78 

•70 

.64 

.44 

.42 

2.31 

» 

71-A 

b3 

t 

5 

t 

66 

-5b 

-59 

.57 

.78 

-70 

.49 

.48 

2.21 

5t 

a.* 

39 

5 

t 

t 

66 

.ta 

-55 

•58 

.71 

.61 

•5b 

.44 

.38 

•a 

2.f  * 

55 

7X-A 

b6 

5 

6 

5 

59 

.59 

-67 

40 

.78 

.a 

.45 

.44 

2.48 

56 

7X-A 

b6 

5 

6 

5 

61 

-5b 

-57 

.36 

.76 

.75 

.48 

.44 

M 

2.86 

a 

E3 

23 

4 

t 

4 

6b 

-4 

-it 

•33 

•38 

.bo 

-27 

.33 

0.S5 

56 

a-a 

16 

1 

t 

5 

6b 

•29 

.68 

.bl 

•23 

.31 

.0 

1.16 

g 

a-s 

51 

6 

6 

5 

59 

_• 

-33 

M 

.bb 

.50 

•23 

.32 

.44 

l.lb 

60 

a-s 

tl 

5 

5 

5 

5? 

.4 

-39 

.60 

.60 

•5b 

-33 

.38 

.44 

1-22 

61 

TX-8 

» 

5 

5 

6 

5b 

-b9 

.33 

.56 

.59 

.26 

.43 

.44 

1-30 

62 

7X-0 

69 

7 

6 

t 

a 

-• 

-Cl 

J6 

.51 

.t7 

.21 

.35 

.44 

0.69 

63 

TX-Y 

n 

6 

6 

6 

55 

-6b 

-58 

.68 

•58 

•72 

.44 

.53 

.35 

2.10 

ft 

E“I 

Ob 

6 

6 

5 

55 

.60 

.5* 

4 0 

•72 

•7b 

.44 

•58 

-45 

2.19 

f? 

a-t 

?1 

7 

7 

6 

55 

.te 

.49 

M 

•69 

.55 

.44 

.50 

-38 

1.78 

66 

a-A 

43 

5 

6 

5 

61 

-5b 

-5C 

4a 

•66 

.71 

.45 

.44 

•33 

2Jt7 

67 

a-A 

t2 

6 

6 

6 

b8 

.58 

-58 

4ft 

•79 

.69 

-57 

.44 

-45 

2.28 

66 

a-A 

96 

5 

5 

5 

50 

*57 

-71 

.70 

•97 

.a 

.60 

-•4 

.46 

2.60 

69 

a-s 

56 

5 

5 

5 

58 

.4 

.bl 

.39 

•60 

.53 

•39 

.43 

.44 

J-*4 

70 

•a-8 

22 

6 

6 

6 

5b 

.4 

-31 

•31 

.52 

.50 

.30 

.37 

.4* 

0.9? 

71 

a-8 

t 

6 

6 

6 

50 

.4 

-37 

.tt 

.66 

.58 

-34 

.45 

.44 

1^6 

72 

1X-Y 

86 

7 

7 

6 

a 

-56 

-52 

.ST 

.65 

.68 

.4# 

•5b 

.41 

2.03 

73 

a-t 

IOC 

6 

t 

6 

51 

•5b 

-b6 

.68 

.89 

•63 

.44 

.55 

-34 

1*91 

7b 

7X-Y 

75 

7 

6 

6 

50 

•59 

•b9 

.39 

.57 

.75 

•44 

-51 

.34 

L.A 

75 

a-A 

g 

7 

7 

6 

17 

.6b 

.63 

46 

.79 

.78 

•57 

.44 

•bl 

2.48 

76 

a-A 

61 

6 

7 

6 

b3 

.66 

-65 

.66 

•99 

.62 

•57 

•44 

•a 

2.58 

2 

a-A 

62 

6 

6 

6 

36 

-58 

•72 

.66 

•97 

•79 

.60 

.44 

•a 

2.43 

70 

79 
APPENDIX G


DERIVATION OF DISTRIBUTION OF AGCT STANDARD SCORES FOR
TOTAL STRENGTH POPULATION OF THE ARMED
FORCES AS OF 31 DECEMBER 1944

The plan for standardization of AFQT-1 and AFQT-2 required that norms for the tests
represent percentile scores in a total potential military population under conditions of full
mobilization. In addition, since the Selective Service Act of 1948 fixed the minimum acceptable
mental standard for induction as Army Standard Score 70, it was required that scores on
the new AFQT equivalent to AGCT standard scores be established. This latter objective
could be accomplished by determining equipercentile equivalent AFQT raw scores to AGCT
standard scores for the sample population used in establishing the percentile norms.
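The equipercentile idea can be illustrated with a minimal sketch. It assumes two sets of scores from the same sample and pairs each raw score with the lowest reference score that has at least the same percentile rank. The function names and the four-person sample are hypothetical and are not taken from the report; the report's own tabulation procedure is not described in enough detail to reproduce exactly.

```python
from bisect import bisect_right


def percentile_ranks(scores):
    """Map each distinct score to the percent of the sample at or below it."""
    ordered = sorted(scores)
    n = len(ordered)
    return {s: 100.0 * bisect_right(ordered, s) / n for s in set(ordered)}


def equipercentile_table(raw_scores, reference_scores):
    """Pair each raw score with the lowest reference score whose
    percentile rank is at least as high (hypothetical helper)."""
    raw_ranks = percentile_ranks(raw_scores)
    ref_ranks = sorted(percentile_ranks(reference_scores).items())
    table = {}
    for raw, rank in sorted(raw_ranks.items()):
        for ref, ref_rank in ref_ranks:
            if ref_rank >= rank:
                table[raw] = ref
                break
    return table


# Hypothetical sample: four examinees with a raw score and a standard score.
conversion = equipercentile_table([1, 2, 3, 4], [70, 80, 90, 100])
```

With real data the two score lists would come from the standardization sample, and the resulting table would usually be smoothed before use.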

The major sampling problem therefore was to obtain a standardization sample which
represented the potential military population under conditions of full mobilization. It was
agreed that the best model of such a population would be the total military population at the time
of peak mobilization in World War II. This was in December 1944. It was further decided that
a sample population which duplicated the AGCT standard score distribution of this designated
parent population (World War II military strength), if generally controlled on geographical
distribution and Service membership, would serve as a satisfactory sample for standardization.
These agreements were based on two major assumptions: (1) The 11,694,229 enlisted men and
officers in the Armed Forces as of December 1944 would not differ in significant population
parameters from a potential full mobilization population in the next five or ten years; and
(2) The distribution of AGCT standard scores would suffice as the major parameter on which
samples could be selected to represent this population for standardization purposes. Table
G-1 shows the total military manpower on 31 December 1944 as it was distributed among the
various Services. It was necessary to break out numbers of directly commissioned officers
from those commissioned in schools, as shown, in order to build AGCT standard score
distributions from data which were available.


Table G-1. Strength of Armed Services as of 31 December 1944.


                                           OFFICERS
                     Enlisted      Directly        Commissioned      Total
Service              Men           Commissioned    From Ranks        Manpower
(1)                  (2)           (3)             (4)               (5)

Army-Air Force       7,127,897     220,543         619,940           7,968,380
Navy                 2,735,270     293,268          82,716           3,111,254
Coast Guard            147,885      11,707             480             160,052
Marine Corps           414,561      11,995          27,987             454,543

TOTAL MANPOWER                                                      11,694,229
The purpose of this Appendix is to demonstrate how the AGCT standard score
distribution for this total World War II population was estimated from available data.


GENERAL  PROCEDURE 

The general procedure for estimating the AGCT score distribution for the total Armed
Forces strength was: (1) To obtain an estimate of the distribution for each Service separately;
and (2) To weight the distribution for each Service in accordance with its total strength (as
shown in Table G-1) and to combine the weighted distributions.


INDIVIDUAL  SERVICE  DISTRIBUTIONS 

As a basis for building AGCT distributions, very large samples were obtained of input
in the various Services as follows:

Army-Air Force. A 2% sample of input during 1944 in 5-point AGCT standard score
intervals. Since the Air Force was a part of the Army at this time, the two Services were treated
as one.


Navy. Total input for the year January 1944 to February 1945 in terms of Navy General
Classification Test scores. These were converted to equivalent AGCT scores by means of
conversions available in a previous Army-Navy Classification Battery comparison study.

Marine Corps. Total input for the same period covered by the Navy sample. The Marine
Corps used AGCT in its classification procedures, so no further conversions of scores were
necessary.

It was assumed that the above distributions of enlisted men input would adequately
represent the distribution of enlisted men and officers commissioned from the ranks as of
the base period, December 1944, since this portion of the population had come from such input in
the past. Therefore, the single correction of each distribution for officers directly
commissioned from civilian life was necessary to give a representative distribution of strength as of
December 1944.


CORRECTION FOR DIRECTLY COMMISSIONED OFFICERS

Data from Table G-1 gave the basis for applying corrections to each interval of each
Service input distribution in order to account for directly commissioned officers. The following
proportions of enlisted men to directly commissioned officers were derived from the table:

Army-Air Force: 7,127,897 enlisted men to 220,543 directly
commissioned officers, or .969987 to .030013.

Navy: 2,735,270 to 293,268, or .903165 to .096835.

Marine Corps: 414,561 to 11,995, or .971879 to .028121.
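The proportions above follow from simple division, as the short sketch below shows. It uses the enlisted and directly commissioned counts quoted in the text (the Navy enlisted figure is taken as 2,735,270, the value used in this section); the dictionary and function names are illustrative only. The computed values agree with the printed ones to within about one unit in the sixth decimal place, which may reflect truncation rather than rounding in the original.

```python
# Enlisted men and directly commissioned officers, as quoted in the text above.
COUNTS = {
    "Army-Air Force": (7_127_897, 220_543),
    "Navy": (2_735_270, 293_268),
    "Marine Corps": (414_561, 11_995),
}


def enlisted_officer_split(enlisted, direct):
    """Return (enlisted share, directly commissioned share) of their sum."""
    total = enlisted + direct
    return enlisted / total, direct / total


SPLITS = {name: enlisted_officer_split(*pair) for name, pair in COUNTS.items()}

for name, (p_enlisted, p_direct) in SPLITS.items():
    print(f"{name}: {p_enlisted:.6f} to {p_direct:.6f}")
```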

Therefore, the input distribution of each Service was cast into a percentile distribution,
which in turn was applied to the above proportion of enlisted men to show the proportional
distribution of enlisted men. These distributions are shown in Table G-2, columns (2), (5), and
(8). To these, the directly commissioned officers were added with their total proportion being
distributed in intervals at standard score 110 and above in the same manner as was the
distribution of enlisted men. The assumption here was that officers would be distributed like a
random selection of enlisted men with standard scores 110 and above (assuming directly
commissioned officers to be similar in quality to those selected from the ranks where AGCT standard
score 110 is required for OCS).
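The correction just described can be sketched as follows. The sketch assumes an enlisted input distribution keyed by the lower bound of each score interval and spreads the officer share over intervals at 110 and above in proportion to the enlisted frequencies there. The function name and the three-interval example are hypothetical; the report's Table G-2 is not reproduced here.

```python
def add_direct_officers(enlisted_dist, p_enlisted, p_officer, cutoff=110):
    """enlisted_dist maps interval lower bound -> fraction of enlisted input
    (fractions sum to 1). Returns the enlisted part, the officer correction,
    and their sum for each interval."""
    enlisted_part = {lo: f * p_enlisted for lo, f in enlisted_dist.items()}
    # Share of enlisted input at or above the cutoff, used to rescale.
    upper = sum(f for lo, f in enlisted_dist.items() if lo >= cutoff)
    officer_part = {
        lo: (p_officer * f / upper if lo >= cutoff else 0.0)
        for lo, f in enlisted_dist.items()
    }
    total = {lo: enlisted_part[lo] + officer_part[lo] for lo in enlisted_dist}
    return enlisted_part, officer_part, total


# Hypothetical three-interval distribution, for illustration only.
example = {90: 0.5, 110: 0.3, 130: 0.2}
enlisted_part, officer_part, total = add_direct_officers(example, 0.97, 0.03)
```

In this toy case the officer share of .03 is split .018 and .012 between the two upper intervals, mirroring the 3-to-2 ratio of enlisted men in those intervals.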
Table G-3. Proportional distribution of AGCT standard scores for total strength of
Armed Forces as of 31 December 1944.

AGCT
Standard      Army-Air                                                   Smoothed
Score         Force       Navy          Marines     Total      Cumulative  Percentiles
(1)           (2)         (3)           (4)         (5)        (6)         (7)

160 and up    .000020     .001199       .000004     .001223    1.000000    100
155-159       .000271     .000636       .000012     .000919     .998777    100
150-154       .001449     .001410       .000041     .002900     .997858    100
145-149       .002899     .003345       .000086     .006330     .994958    100
140-144       .006523     .004466       .000168     .011157     .988628     99
135-139       .012321     .007640       .000438     .020399     .977471     98
130-134       .018989     [illegible]   .000867     .027842     .957072     96
125-129       .034064     .020319       .001869     .056052     .929230     92
120-124       .046965     .018632       .002627     .068224     .873178     87
115-119       .049284     .025002       .003781     .078067     .804954     80
110-114       .059432     .033328       .004754     .097514     .726887     73
105-109       .056709     .022608       .005523     .084840     .629373     63
100-104       .053008     .028291       .003906     .085205     .544533     55
95-99         .049042     .026565       .004170     .079777     .459328     47
90-94         .043622     .022821       .003060     .069503     .379551     37
85-89         .042961     .016189       .002618     .061768     .310048     30
80-84         .037013     .014347       .001534     .052894     .248280     26
75-79         .034369     .009598       .001383     .045350     .195386     20
70-74         .031196     .007355       .000990     .039541     .150036     15
65-69         .026437     .001987       .000678     .029102     .110495     [illegible]
60-64         .022472     .002338       .000423     .025733     .081393      9
55-59         .018507     .001894       .000177     .020578     .055660      6
50-54         .013219     .000555       .000060     .013834     .035082      4
45-49         .008394     .000353       .000038     .008785     .021248      2
40-44         .012228     .000171       .000064     .012463     .012463     [illegible]

TOTAL         .681394     .279737       .038869    1.000000
Table G-2, columns (3), (6), and (9), shows the corrections for directly commissioned
officers. Columns (4), (7), and (10) show the total (sum of preceding two columns in each
case) proportional distribution for each Service. These total proportional distributions were
then taken to represent AGCT standard score distributions for total strength in each Service
as of December 1944.


COMBINING ALL SERVICES


The individual percentage distributions for each Service were combined to yield a
percentage distribution of AGCT standard scores for the total Armed Forces strength as shown in
Table G-3. Each Service's proportion of the total strength, 11,694,229 men, was derived by
dividing this number into its total strength as shown in Table G-1. These proportions were as
follows: Army-Air Force, .681394; Marine Corps, .038869; Navy, .279737. The Navy
proportion included the Coast Guard strength, shown in Table G-1, on the assumption that the Coast
Guard personnel would be distributed in approximately the same manner as the Navy. For each
Service, this proportional figure was multiplied by the total percentage in each AGCT interval
as shown in Table G-2 to give the proportion of the total Armed Forces strength in each
interval. The interval percentages for each Service were then added to yield the total Armed Forces
distribution shown in column (5) of Table G-3. These were cumulated in column (6) to show the
AGCT percentile norms for the World War II full mobilization population.
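The weighting and cumulation can be sketched in a few lines. The Service totals are those of Table G-1; the Coast Guard total is garbled in the available copy, so the value 160,052 used here is an assumption (it is the figure implied by the stated grand total). The per-Service distributions in the example are hypothetical placeholders, since Table G-2 is not available, and the function names are illustrative.

```python
# Total manpower per Service from Table G-1 (31 December 1944).
# Coast Guard: 160,052 is assumed, being the value implied by the grand total.
TOTALS = {
    "Army-Air Force": 7_968_380,
    "Navy": 3_111_254,
    "Coast Guard": 160_052,
    "Marine Corps": 454_543,
}
GRAND_TOTAL = 11_694_229


def service_weights():
    """Each Service's share of total strength; Coast Guard is folded into Navy."""
    navy = TOTALS["Navy"] + TOTALS["Coast Guard"]
    return {
        "Army-Air Force": TOTALS["Army-Air Force"] / GRAND_TOTAL,
        "Navy": navy / GRAND_TOTAL,
        "Marine Corps": TOTALS["Marine Corps"] / GRAND_TOTAL,
    }


def combine(service_dists, weights):
    """Weight each Service's interval proportions and add them, then
    cumulate from the lowest interval upward."""
    intervals = sorted(next(iter(service_dists.values())))
    total = {
        i: sum(weights[s] * service_dists[s][i] for s in weights)
        for i in intervals
    }
    cumulative, running = {}, 0.0
    for i in intervals:
        running += total[i]
        cumulative[i] = running
    return total, cumulative


weights = service_weights()
# Hypothetical per-Service distributions keyed by interval lower bound.
dists = {name: {40: 0.2, 100: 0.5, 130: 0.3} for name in weights}
total, cumulative = combine(dists, weights)
```

The computed weights round to .681394, .279737, and .038869, matching the proportions stated above.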


As a further extension the cumulative percentages were plotted and fitted to a smoothed
ogive. Column (7) shows the final norms derived from the smoothed ogive. Subsequent
samples used in standardizing AFQT were selected to duplicate this smoothed ogive curve.
APPENDIX H


CONVERSION TABLE: RAW SCORES ON AFQT-1 OR AFQT-2 TO PERCENTILE SCORES
AND TO ARMY STANDARD SCORES


Raw Score   Percentile    Standard Score      Raw Score   Percentile    Standard Score

   90          100        [illegible]            45           28             86
   89          100            157                44           27             85
   88          100            151                43           26             84
   87          100            146                42           24             83
   86           99            142                41           23             82
   85           98            139                40           22             81
   84           97            137                39           21             80
   83           96            134                38           20             79
   82           95            131                37           19             78
   81           93            130                36           18             77
   80           92            128                35           17             76
   79           90            126                34           16             76
   78           89            125                33           15             73
   77           87            123                32           14             71
   76           85            122                31           13             70
   75           84            121                30           12             69
   74           82            120                29           12             68
   73           80            118                28           11             66
   72           78            117                27           10             65
   71           76            116                26            9             64
   70           74            115                25            9             63
   69           73            114                24       [illegible]        62
   68           71            113                23            7         [illegible]
   67           69            112                22            7             60
   66           67            111                21            6             59
   65           65            110                20            5             57
   64           63            109                19            5             56
   63           61            107                18            4             55
   62           59            106                17            4         [illegible]
   61           57            105                16            3         [illegible]
   60           55            104                15            2             50
   59           53            103                14            2             48
   58           51            101                13            2             47
   57           49            100                12            2             45
   56           47             99                11            2             43
   55           45             98                10            2             42
   54           43             97                 9            2         [illegible]
   53           41             96                 8            1             41
   52           39             95                 7            1             41
   51           37             94                 6            1             40
   50           36             93                 5            1             39
   49           34             92                 4            1             39
   48           32             91                 3            1             39
   47       [illegible]        89                 2       [illegible]    [illegible]
   46           30             88                 1            1             39
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